<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:media="http://search.yahoo.com/mrss/"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>NLP &#8211; AI SCKOOL</title>
	<atom:link href="https://aisckool.com/category/nlp/feed/" rel="self" type="application/rss+xml" />
	<link>https://aisckool.com</link>
	<description>All About Artificial Intelligence</description>
	<lastBuildDate>Fri, 17 Apr 2026 18:55:16 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://aisckool.com/wp-content/uploads/2024/05/cropped-8FDB48F0-2148-449F-B10B-86E84E56DAD5-removebg-preview-1-e1716890217940-32x32.png</url>
	<title>NLP &#8211; AI SCKOOL</title>
	<link>https://aisckool.com</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Jacob Andreas and Brett McGuire named winners of the Edgerton Award</title>
		<link>https://aisckool.com/jacob-andreas-and-brett-mcguire-recognized-the-winners-of-the-edgerton-award/</link>
					<comments>https://aisckool.com/jacob-andreas-and-brett-mcguire-recognized-the-winners-of-the-edgerton-award/#respond</comments>
		
		<dc:creator><![CDATA[The AI Sckool]]></dc:creator>
		<pubDate>Fri, 17 Apr 2026 18:55:15 +0000</pubDate>
				<category><![CDATA[NLP]]></category>
		<guid isPermaLink="false">https://aisckool.com/?p=26204</guid>

					<description><![CDATA[Jacob Andreas, an associate professor in MIT&#8217;s Department of Electrical Engineering and Computer Science (EECS), and Brett McGuire, an associate professor in the Department of Chemistry, have been named winners of the 2026 Harold E. Edgerton Award for Faculty Achievement. Established in 1982 as a tribute to the great and enduring support provided by Institute [&#8230;]]]></description>
										<content:encoded><![CDATA[<div>
<p dir="ltr"><a href="https://www.eecs.mit.edu/people/jacob-andreas/" target="_blank" rel="noopener">Jacob Andreas</a>, an associate professor in the MIT Department of Electrical Engineering and Computer Science (EECS), and <a href="https://chemistry.mit.edu/profile/brett-mcguire/" target="_blank" rel="noopener">Brett McGuire</a>, an associate professor in the Department of Chemistry, have been named winners of the 2026 Harold E. Edgerton Award for Faculty Achievement. Established in 1982 as a tribute to the great and enduring support provided by Institute Professor Emeritus Harold E. Edgerton to junior faculty members, the award is given annually to recognize exceptional distinction in teaching, research, and service.</p>
<p dir="ltr">“The Department of Chemistry is extremely delighted to recognize Brett for his science that changed the way we think about carbon in space,” says Class of 1942 Chemistry Professor and Department Chair Matthew D. Shoulders. &#8220;Brett&#8217;s lab combines laboratory spectroscopy, radio astronomy, and sophisticated signal analysis methods to extract definitive molecular fingerprints from extremely poor data. His discovery of polycyclic aromatic hydrocarbons in the cold interstellar medium opened a powerful new window into astrochemistry. Moreover, Brett invents creative and unique tools that make these types of discoveries possible.&#8221;</p>
<p dir="ltr">“Jacob Andreas represents the best of MIT EECS,” says Asu Ozdaglar, head of EECS. &#8220;He is an innovative researcher whose work combines computational and linguistic approaches to build the foundations of language learning. He is an extraordinary educator who has brought these critical ideas to our core classes in natural language processing and machine learning. His ability to combine fundamentals of theory with real-world impact while deepening the social and ethical dimensions of computer science makes him truly deserving of the Edgerton Faculty Achievement Award.&#8221;</p>
<p dir="ltr">Andreas joined the MIT faculty in July 2019 and is affiliated with the Computer Science and Artificial Intelligence Laboratory. He works on natural language processing (NLP) and, more broadly, artificial intelligence. His goal is to understand the computational foundations of language learning and to build intelligent systems that can learn under human supervision. Andreas&#8217; honors include the Samsung AI Researcher of the Year Award, MIT&#8217;s Kolokotrones and Junior Bose Teaching Awards, a 2024 Sloan Research Fellowship, and paper awards from the North American Chapter of the Association for Computational Linguistics, the International Conference on Machine Learning, and the Association for Computational Linguistics.</p>
<p dir="ltr">Andreas earned a BA from Columbia University, an MA from the University of Cambridge (where he studied as a Churchill Scholar), and a PhD in natural language processing from the University of California at Berkeley. His work in NLP has addressed thorny issues surrounding the capability gap between humans and computers. &#8220;A hallmark of human language use is our ability to generalize compositionally,&#8221; explains Antonio Torralba, the Delta Electronics Professor and head of the artificial intelligence and decision-making faculty in EECS. &#8220;Many of the fundamental challenges of natural language processing are being solved simply by training larger and larger neural models, but this kind of compositional generalization remains a persistent difficulty, and without the ability to generalize compositionally, the deep learning toolkit will never be robust enough to handle the most demanding real-world NLP tasks. Jacob&#8217;s work on compositional modeling draws new connections between NLP and work in computer vision and physics that aims to model systems governed by symmetries and other algebraic structures, and using these connections his group has been able to build NLP models that exhibit a range of new, human-like language acquisition behaviors, including one-shot word learning, learning with mutual exclusivity constraints, and learning grammar rules in extremely low-resource settings.&#8221;</p>
<p dir="ltr">Within EECS, Andreas has developed a number of advanced courses in natural language processing, as well as new exercises designed to challenge students to confront important social and ethical issues surrounding the deployment of machine learning. &#8220;Jacob has played a leading role in completely modernizing and expanding our natural language processing course offerings,&#8221; says award nominator Leslie Pack Kaelbling, the Panasonic Professor in EECS. &#8220;He led the development of the modern two-course sequence that is a cornerstone of the new AI+D [artificial intelligence and decision-making] major, which routinely enrolls several hundred students each semester. His knowledge of the field is broad and deep, and his classes integrate a classical structural understanding of language with cutting-edge learning-based approaches. He has put MIT EECS on the world map as the place to learn natural language processing at every level.&#8221;</p>
<p dir="ltr">Brett McGuire joined the MIT faculty in 2020 and was promoted to associate professor in 2025. His research operates at the intersection of physical chemistry, molecular spectroscopy, and observational astrophysics, where he seeks to discover how the chemical building blocks of life evolve during, and help shape, the birth of stars and planets. A former Jansky Fellow and then Hubble Postdoctoral Fellow at the National Radio Astronomy Observatory, McGuire holds a BS in chemistry from the University of Illinois and a PhD in physical chemistry from Caltech. His honors include a 2026 Sloan Fellowship, the Beckman Young Investigator Award, the Helen B. Warner Prize in Astronomy, and the MIT Prize for Teaching with Digital Technology.</p>
<p dir="ltr">The faculty who nominated McGuire for this award praised his extraordinary public communications, his immediate willingness to teach Class 5.111 (Principles of Chemical Sciences), a General Institute Requirement (GIR) course with 150-500 students, and his service to both the MIT and astrochemical communities.</p>
<p dir="ltr">&#8220;Brett is at the very top of his cohort of astrochemists with his discovery of fused carbon ring compounds in the cold regions of the ISM [interstellar medium], an observation that informs the incorporation of carbon into planets,&#8221; says Sylvia Ceyer, the John C. Sheehan Professor of Chemistry, in her nomination. &#8220;His extensive involvement in service-oriented activities in the astrochemistry/astrophysics community is highly unusual for a young scientist and is a testament to the value that the astronomical community places on his wisdom and judgment. His phenomenal organizational skills have made his contributions to MIT&#8217;s graduate admissions protocols and seminar administration the envy of the faculty. Most importantly, Brett is a wonderful teacher who cares deeply about students&#8217; understanding and success, not only in the classroom but also in their future endeavors.&#8221;</p>
<p dir="ltr">“As an assistant professor, Brett volunteered to teach 5.111, a large GIR course with 150-500 students, and received some of the best teaching ratings of any faculty member who has taught the course,” says Mei Hong, the David A. Leighty Professor of Chemistry. &#8220;He has a natural talent for explaining abstract concepts in physical chemistry in an engaging way. His slides, which he prepared from scratch rather than modifying previous years&#8217; materials from other professors, are clear, and&#8230; the combination of clear explanation and humor generated enormous enthusiasm and interest in chemistry among students.&#8221;</p>
<p dir="ltr">Course evaluations of McGuire&#8217;s classes praised his humor, clarity of explanation, and ability to turn a lecture into a &#8220;science demonstration.&#8221; &#8220;I hadn&#8217;t felt that kind of desire for a deeper understanding of a subject beyond just the grade [in some time],&#8221; says one student. &#8220;Brett definitely made me love learning.&#8221; </p>
<p dir="ltr">“Brett is an outstanding faculty member who strives to support student learning and success,” says Jennifer Weisman, associate director of academic programs in chemistry. “He is thoughtful, caring and goes out of his way to help his colleagues, students and employees.”</p>
<p dir="ltr">“I am thrilled to have been selected for the Edgerton Award this year,” says McGuire. &#8220;The award is nominally for teaching, research, and service; MIT, and the chemistry department in particular, is an incredible place to learn and grow in all of these areas. I am incredibly grateful for the mentorship, enthusiasm, and support I have received from my colleagues, students both in the lab and in the classroom, and the MIT community during my time here. I look forward to many more years of exciting discovery with this one-of-a-kind community.&#8221;</p>
</div>
]]></content:encoded>
					
					<wfw:commentRss>https://aisckool.com/jacob-andreas-and-brett-mcguire-recognized-the-winners-of-the-edgerton-award/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<media:content url="https://i1.wp.com/news.mit.edu/sites/default/files/images/202604/mit-Edgerton-Award-Brett-McGuire-Jacob-Andreas.jpg?ssl=1" medium="image"></media:content>
            	</item>
		<item>
		<title>MIT-IBM Watson AI Lab seed projects: empowering early-career faculty</title>
		<link>https://aisckool.com/a-seed-of-the-mit-ibm-watson-ai-lab-signaling-empowering-early-career-faculty/</link>
		
		<dc:creator><![CDATA[The AI Sckool]]></dc:creator>
		<pubDate>Tue, 17 Mar 2026 23:54:04 +0000</pubDate>
				<category><![CDATA[NLP]]></category>
		<guid isPermaLink="false">https://aisckool.com/?p=25515</guid>

					<description><![CDATA[The early years of a faculty member&#8217;s career are a formative and invigorating time, during which a solid foundation can be laid that helps determine the trajectory of their research. This includes building a research group, which requires inventive ideas and direction, creative collaborators, and reliable resources. For a group of MIT faculty working in [&#8230;]]]></description>
										<content:encoded><![CDATA[<div>
<p>The early years of a faculty member&#8217;s career are a formative and invigorating time, during which a solid foundation can be laid that helps determine the trajectory of their research. This includes building a research group, which requires inventive ideas and direction, creative collaborators, and reliable resources. </p>
<p>For a group of MIT faculty working in and around artificial intelligence, early collaborations with the MIT-IBM Watson AI Lab have played a critical role in helping to advance ambitious research directions and shape prolific research groups.</p>
<p><strong>Building momentum</strong></p>
<p>“The MIT-IBM Watson AI Lab has played an extremely important role in my success, especially when I first started out,” says Jacob Andreas—an associate professor in the Department of Electrical Engineering and Computer Science (EECS), a member of MIT&#8217;s Computer Science and Artificial Intelligence Laboratory (CSAIL), and a researcher at the MIT-IBM Watson AI Lab—who studies natural language processing (NLP). Shortly after joining MIT, Andreas began his first major project at the MIT-IBM Watson AI Lab, working on language representation and structured data augmentation methods for low-resource languages. “It really allowed me to get the lab up and running and start recruiting students.” </p>
<p>Andreas notes that this occurred at a &#8220;pivotal moment,&#8221; when the field of NLP was shifting toward the study of large language models &#8211; a much more computationally intensive line of work that the MIT-IBM Watson AI Lab made accessible. &#8220;I feel like the work we did on this [first] project, and the collaboration with all of our people on the IBM side, was very helpful in figuring out how to make this change happen.&#8221; Moreover, Andreas&#8217; group has been able to carry out multi-year projects on pre-training, reinforcement learning, and calibration for reliable responses thanks to the computational resources and expertise of the MIT-IBM community.</p>
<p>Several other faculty members also found well-timed participation in the MIT-IBM Watson AI Lab to be very beneficial. &#8220;Having both the intellectual support and the opportunity to leverage some of the computational resources available at MIT-IBM has been transformative and extremely important to my research program,&#8221; says Yoon Kim &#8211; an associate professor in EECS and CSAIL and a researcher in the MIT-IBM Watson AI Lab &#8211; who has also seen a change in the trajectory of his research field. Before joining MIT, Kim met his future collaborators during a postdoctoral fellowship at MIT-IBM, where he developed a neurosymbolic model; currently, Kim&#8217;s team is developing methods to improve the capabilities and performance of large language models (LLMs). </p>
<p>One factor behind his group&#8217;s success, he notes, is a smooth research process with engaged intellectual partners. This enabled his MIT-IBM team to scope projects, conduct large-scale experiments, identify bottlenecks, validate techniques, and adapt as necessary to develop state-of-the-art methods for potential real-world applications. &#8220;It&#8217;s an impetus for new ideas, and that&#8217;s what I think is unique about this relationship,&#8221; Kim says.</p>
<p><strong>Pooling expertise</strong></p>
<p>The nature of the MIT-IBM Watson AI Lab is such that it not only brings artificial intelligence researchers together to accelerate research, but also bridges work across disciplines. Lab researcher and MIT associate professor in EECS and CSAIL Justin Solomon describes his research group as having grown up alongside the lab, and the collaboration as &#8220;crucial&#8230; from its inception to the present.&#8221; Solomon&#8217;s research team focuses on theoretically oriented geometric problems in computer graphics, vision, and machine learning. </p>
<p>Solomon credits the MIT-IBM collaboration with broadening his skills and the applications of his group&#8217;s work — a sentiment also shared by lab researchers Chuchu Fan, associate professor of aeronautics and astronautics and member of the Laboratory for Information and Decision Systems, and Faez Ahmed, associate professor of mechanical engineering. &#8220;They [IBM] were able to translate some of these really messy engineering problems into the kind of math our team can work on and close the loop,&#8221; Solomon says. For Solomon, that includes combining distinct AI models that have been trained on different datasets for distinct tasks. &#8220;I think these are really exciting spaces,&#8221; he says.</p>
<p>&#8220;I think these career-starting projects [with the MIT-IBM Watson AI Lab] have largely shaped my own research agenda,&#8221; says Fan, whose research combines robotics, control theory, and safety-critical systems. Like Kim, Solomon, and Andreas, Fan and Ahmed began collaborative projects in their first year at MIT. Constraints and optimization govern the problems Fan and Ahmed tackle, which therefore require deep domain knowledge outside of artificial intelligence. </p>
<p>Collaborating with the MIT-IBM Watson AI Lab allowed Fan&#8217;s group to combine formal methods with natural language processing, which she says enabled the team to move from developing autoregressive task and motion planning for robots to creating LLM-based agents for travel planning, decision-making, and verification. &#8220;This work was the first attempt to use LLMs to translate any natural language, in any form, into a specification that a robot could understand and execute. This is something I am very proud of, but it was very difficult at the time,&#8221; says Fan. What&#8217;s more, through the collaborative investigation, her team was able to improve LLMs&#8217; reasoning, which &#8220;would have been impossible without IBM&#8217;s support,&#8221; she says.   </p>
<p>Through the lab, Faez Ahmed&#8217;s collaborations have facilitated the development of machine learning methods that accelerate the discovery and design of complex mechanical systems. His group&#8217;s <a href="https://decode.mit.edu/projects/links/" target="_blank" rel="noopener">linkages</a> work, for example, uses &#8220;generative optimization&#8221; to solve engineering problems in a way that is both data-driven and precise, and has recently applied multimodal data and LLMs to computer-aided design. Ahmed says AI is often applied to problems that can already be solved but could benefit from greater speed and efficiency; however, challenges &#8211; such as mechanical linkages once considered &#8220;almost unsolvable&#8221; &#8211; are now within reach. &#8220;I think it&#8217;s definitely a feature [of our MIT-IBM team],&#8221; says Ahmed, praising the achievements of his MIT-IBM group, co-led by IBM&#8217;s Akash Srivastava and Dan Gutfreund.</p>
<p>What began as early collaborations for each of these MIT faculty members has evolved into enduring intellectual relationships in which both parties are &#8220;excited about learning&#8221; and &#8220;student-oriented,&#8221; adds Ahmed. Taken together, the experiences of Jacob Andreas, Yoon Kim, Justin Solomon, Chuchu Fan, and Faez Ahmed demonstrate the impact that sustained, practical relationships between academia and industry can have on the formation of research groups and on ambitious scientific exploration.</p>
</div>
]]></content:encoded>
					
		
		
		<media:content url="https://i3.wp.com/news.mit.edu/sites/default/files/images/202602/mit-ibm-seed-to-signal.jpg?ssl=1" medium="image"></media:content>
            	</item>
		<item>
		<title>An AI tool for personalized travel planning</title>
		<link>https://aisckool.com/inje-for-personalized-planning-of-artificial-intelligence-travel/</link>
		
		<dc:creator><![CDATA[The AI Sckool]]></dc:creator>
		<pubDate>Tue, 10 Jun 2025 23:16:35 +0000</pubDate>
				<category><![CDATA[NLP]]></category>
		<guid isPermaLink="false">https://aisckool.com/?p=15500</guid>

					<description><![CDATA[Travel agencies help arrange comprehensive logistics, such as transportation, accommodations, and meals, for business travelers, vacationers, and everyone in between. For those who prefer to make their own arrangements, large language models (LLMs) seem like a powerful tool for the task because of their ability to interact iteratively through natural language, offering commonsense reasoning, [&#8230;]]]></description>
										<content:encoded><![CDATA[<div>
<p>Travel agencies help arrange comprehensive logistics, such as transportation, accommodations, and meals, for business travelers, vacationers, and everyone in between. For those who prefer to make their own arrangements, large language models (LLMs) seem like a powerful tool for the task because of their ability to interact iteratively through natural language, offering commonsense reasoning, gathering information, and calling other tools to help with the task at hand. However, recent work has shown that even state-of-the-art LLMs struggle with complex logistics and mathematical reasoning, as well as with problems involving many constraints, such as travel planning, where they have been found to produce feasible solutions only 4 percent of the time or less, even with additional tools and application programming interfaces (APIs).</p>
<p>A research team from MIT and the MIT-IBM Watson AI Lab took on this problem to see whether they could increase the success rate of LLM solutions for complex problems. &#8220;We believe a lot of these planning problems are naturally combinatorial optimization problems,&#8221; in which several constraints need to be satisfied in a certifiable way, says Chuchu Fan, an associate professor in the MIT Department of Aeronautics and Astronautics (AeroAstro) and the Laboratory for Information and Decision Systems (LIDS). She is also a researcher in the MIT-IBM Watson AI Lab. Her team uses machine learning, control theory, and formal methods to develop safe and verifiable control systems for robotics, autonomous systems, controllers, and human-machine interaction.</p>
<p>Noting the transferable nature of their work to travel planning, the group set out to create a user-friendly framework that can act as an AI travel broker to help develop realistic, logical, and complete travel plans. To achieve this, the researchers combined a common LLM with an algorithm and a complete satisfiability solver. Solvers are mathematical tools that rigorously check whether criteria can be met and how, but they require complex computer programming to use. This makes them natural companions to LLMs for problems like these, where users want help planning in a timely manner without needing programming expertise or to research travel options themselves. Moreover, if a user&#8217;s constraint cannot be met, the new technique can identify and articulate where the problem lies and propose alternatives, which the user can then accept, reject, or modify until a valid plan is formulated, if one exists.</p>
<p>&#8220;Various complexities of travel planning are something everyone will have to deal with at some point. There are different needs, requirements, constraints, and real-world information that you can gather,&#8221; says Fan. &#8220;Our idea is not to ask the LLM to propose a travel plan. Instead, the LLM here acts as a translator, translating the natural-language description of the problem into a problem that the solver can handle [and then providing that to the user],&#8221; says Fan.</p>
<p>Joining Fan on a <a href="https://aclanthology.org/2025.naacl-long.176.pdf" target="_blank" rel="noopener">paper</a> about the work are Yang Zhang of the MIT-IBM Watson AI Lab, AeroAstro graduate student Yilun Hao, and Yongchao Chen, a graduate student in MIT LIDS and at Harvard University. The work was recently presented at the Annual Conference of the North American Chapter of the Association for Computational Linguistics.</p>
<p><strong>Breaking down the solver</strong></p>
<p>&#8220;The solver is really the key here, because when we develop these algorithms, we know exactly how the problem can be solved as an optimization problem,&#8221; says Fan. Specifically, the research group used a solver based on satisfiability modulo theories (SMT), which determines whether a formula can be satisfied. &#8220;With this particular solver, it&#8217;s not just optimization. It&#8217;s reasoning over a lot of different algorithms to understand whether the planning problem is even possible to solve. That&#8217;s quite significant in travel planning. It&#8217;s not a very traditional mathematical optimization problem, because people come in with all these needs and constraints,&#8221; Fan notes.</p>
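<p>The core idea of checking whether a set of travel constraints can be jointly satisfied can be sketched without a full SMT solver. The following toy example is a deliberately simplified stand-in: all candidate options and constraint values are invented, and it uses brute-force enumeration where the researchers&#8217; framework uses an actual SMT solver.</p>

```python
from itertools import product

# Hypothetical candidate options; the real system gathers these via APIs.
flights = [{"id": "F1", "cost": 400}, {"id": "F2", "cost": 250}]
hotels = [{"id": "H1", "cost_per_night": 180}, {"id": "H2", "cost_per_night": 90}]

def satisfies(flight, hotel, nights, budget):
    """Check every constraint on one assignment; an SMT solver does this symbolically."""
    total = flight["cost"] + nights * hotel["cost_per_night"]
    return total <= budget

def find_plan(nights=3, budget=600):
    # Enumerate all flight-hotel assignments; return the first satisfying one, if any.
    for flight, hotel in product(flights, hotels):
        if satisfies(flight, hotel, nights, budget):
            return {"flight": flight["id"], "hotel": hotel["id"],
                    "total": flight["cost"] + nights * hotel["cost_per_night"]}
    return None  # unsatisfiable under these constraints
```

<p>A real SMT solver answers the same satisfiable-or-not question, but over symbolic formulas rather than by enumerating every combination, which is what makes it practical at scale.</p>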
<p><strong>Translation in action</strong></p>
<p>The &#8220;travel agent&#8221; works in four steps that can be repeated as needed. The researchers used GPT-4, Claude-3, or Mistral-Large as the LLM. First, the LLM parses the user&#8217;s requested travel plan into planning steps, noting preferences for budget, hotels, transportation, destinations, attractions, restaurants, and trip duration in days, along with any other stipulations from the user. These steps are then converted into executable Python code (with a natural-language annotation for each constraint), which calls APIs such as CitySearch and FlightSearch to collect data, and runs the SMT solver on the steps specified in the constraint-satisfaction problem. If a sound and complete solution can be found, the solver returns the result to the LLM, which then presents the user with a coherent travel plan.</p>
<p>If one or more constraints cannot be met, the framework begins looking for alternatives. The solver outputs code identifying the conflicting constraints (with their corresponding annotations), which the LLM presents to the user along with potential remedies. The user can then decide how to proceed, until a solution (or the maximum number of iterations) is reached.</p>
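<p>The feedback step above can be illustrated with a toy conflict check. Everything here is hypothetical: the options, the named constraints, and the filtering logic are invented for illustration, whereas the actual framework extracts conflicting constraints from the SMT solver and has the LLM phrase the remedies.</p>

```python
# Hypothetical candidate plans and named constraints.
options = [
    {"id": "plan_a", "cost": 700, "days": 4},
    {"id": "plan_b", "cost": 500, "days": 6},
]

constraints = {
    "budget <= 600": lambda o: o["cost"] <= 600,
    "days <= 5": lambda o: o["days"] <= 5,
}

def diagnose(options, constraints):
    """Return a satisfying plan, or name the constraints that eliminated every candidate."""
    survivors = list(options)
    conflicts = []
    for name, check in constraints.items():
        remaining = [o for o in survivors if check(o)]
        if remaining:
            survivors = remaining
        else:
            # This constraint removed the last candidates; relaxing it would help.
            conflicts.append(name)
    if conflicts:
        return {"plan": None, "conflicts": conflicts}
    return {"plan": survivors[0]["id"], "conflicts": []}
```

<p>In this sketch, reporting the blocking constraint names back to the user plays the role that the solver&#8217;s conflict output and the LLM&#8217;s suggested remedies play in the framework.</p>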
<p><strong>Generalizable and robust planning</strong></p>
<p>The researchers tested their method with the aforementioned LLMs against other baselines: GPT-4 by itself, OpenAI o1-preview by itself, GPT-4 with an information-gathering tool, and a search algorithm that optimizes total cost. Using the TravelPlanner dataset, which includes data for realistic plans, the team looked at several performance metrics: how often a method could deliver a solution, whether the solutions satisfied commonsense criteria (such as not visiting two cities in one day), a method&#8217;s ability to satisfy one or more constraints, and a final pass rate indicating that all constraints were met. The new technique generally achieved a pass rate above 90 percent, compared with 10 percent or lower for the baselines. The team also explored adding a JSON representation at the query step, which made it even easier for the method to deliver solutions, with pass rates of 84.4-98.9 percent.</p>
<p>The MIT-IBM team posed further challenges for their method. They examined how important each component of the solution was &#8211; for example, removing human feedback or the solver &#8211; and how that affected plan adaptation to unsatisfiable queries within 10 or 20 iterations, using a new dataset they created called UnsatChristmas, which includes unseen constraints, and a modified version of TravelPlanner. On average, the MIT-IBM group&#8217;s framework achieved 78.6 and 85 percent success, respectively, rising to 81.6 and 91.7 percent with additional rounds of plan modification. The researchers also analyzed how well it handled new, unseen constraints and paraphrased step descriptions. In both cases, it performed very well, including an 86.7 percent pass rate on the paraphrasing task.</p>
<p>Finally, the MIT-IBM researchers applied their framework to other domains, with tasks including block picking, task allocation, the traveling salesman problem, and warehouse operation. Here, the method must select numbered, colored blocks and maximize its score; optimize robot task assignment for different scenarios; plan trips that minimize travel distance; and complete robot tasks while optimizing warehouse operations.</p>
<p>&#8220;I think this is a very strong and innovative framework that can save people a lot of time, and it&#8217;s also a very novel combination of an LLM and a solver,&#8221; says Hao.</p>
<p>This work was funded, in part, by the Office of Naval Research and the MIT-IBM Watson AI Lab.</p>
</div>
]]></content:encoded>
					
		
		
		<media:content url="https://i3.wp.com/news.mit.edu/sites/default/files/images/202505/mit-watson-travel-planning.jpg?ssl=1" medium="image"></media:content>
            	</item>
		<item>
		<title>Artificial intelligence improves air mobility planning</title>
		<link>https://aisckool.com/artificial-intelligence-improves-air-mobility-planning/</link>
		
		<dc:creator><![CDATA[The AI Sckool]]></dc:creator>
		<pubDate>Fri, 25 Apr 2025 22:48:41 +0000</pubDate>
				<category><![CDATA[NLP]]></category>
		<guid isPermaLink="false">https://aisckool.com/?p=13711</guid>

					<description><![CDATA[Every day, hundreds of chat messages flow between the pilots, crews, and controllers of the 618th Air Operations Center (AOC). These controllers direct a fleet of roughly a thousand aircraft, juggling variables to determine which routes to fly, how much time it will take to load or unload cargo, and who can fly the missions. Their [&#8230;]]]></description>
										<content:encoded><![CDATA[<div>
<p>Every day, hundreds of chat messages flow between the pilots, crews, and controllers of the <a href="https://www.amc.af.mil/Units/618th-Air-Operations-Center/" target="_blank" rel="noopener">618th Air Operations Center</a> (AOC). These controllers direct a fleet of roughly a thousand aircraft, juggling variables to determine which routes to fly, how much time it will take to load or unload cargo, and who can fly the missions. Their mission planning enables the U.S. Air Force to respond quickly to national security needs around the world.</p>
<p>&#8220;Obtaining a missile defense system, for example, takes a lot of work, and that coordination used to be done by phone and email. Now we use chat, which creates opportunities for artificial intelligence to enhance the work of our operations center,&#8221; says Colonel Monaco.</p>
<p>The 618th AOC is sponsoring Lincoln Laboratory to develop these artificial intelligence tools through a project called Conversational AI Technology for Transition (CAITT).</p>
<p>During a visit to Lincoln Laboratory from the 618th AOC&#8217;s headquarters at Scott Air Force Base in Illinois, Colonel Monaco, Colonel Tim Heaton, and Captain Laura Quitiquit met with laboratory researchers to discuss CAITT. CAITT is part of a broader effort to transition AI technology into a major Air Force modernization initiative called the Next Generation Information Technology for Mobility Readiness Enhancement (NITMRE).</p>
<p>The type of AI used in this project is natural language processing (NLP), which allows models to read and process human language. &#8220;We use NLP to map major trends in chat conversations, extract and cite key pieces of information, and identify and contextualize critical decision points,&#8221; says Courtland VanDam, a researcher in Lincoln Laboratory&#8217;s <a href="https://www.ll.mit.edu/r-d/cyber-security-and-information-sciences/artificial-intelligence-technology-and-systems" target="_blank" rel="noopener">AI Technology and Systems Group</a>, which leads the project. CAITT encompasses a suite of tools leveraging NLP.</p>
<p>One of the most mature tools, topic summarization, extracts trending topics from chat messages and presents them in a user-friendly display that highlights critical conversations and emerging problems. For example, a trending topic might read: &#8220;Crew members missing visas for Congo, potential for delays.&#8221; The entry shows the number of chats related to the topic, summarizes the main conversation points as bullets, and links to the specific chat exchanges.</p>
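The article does not describe how topic summarization is implemented, so here is a minimal sketch of the grouping-and-ranking idea using simple keyword counting. It is a stand-in for the NLP models the actual tool uses; the stopword list and chat messages are invented for illustration.

```python
from collections import Counter, defaultdict

STOPWORDS = {"the", "a", "in", "is", "are", "for", "to", "of", "and"}

def summarize_topics(messages, top_n=2):
    """Group chat messages by salient keyword and rank candidate
    topics by message volume (a toy proxy for trend extraction)."""
    keyword_counts = Counter()
    by_keyword = defaultdict(list)
    for msg in messages:
        for w in (w.strip(".,").lower() for w in msg.split()):
            if w not in STOPWORDS:
                keyword_counts[w] += 1
                by_keyword[w].append(msg)
    # A "topic" here is a frequent keyword plus the messages mentioning it.
    return [
        {"topic": kw, "count": n, "messages": by_keyword[kw]}
        for kw, n in keyword_counts.most_common(top_n)
    ]

chats = [
    "Crew member missing visa for Congo stop",
    "Visa delay may push departure to Friday",
    "Fuel stop added for Congo route",
]
topics = summarize_topics(chats)
```

A display like the one described would then show each topic with its chat count and links back to the underlying exchanges.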
<p>&#8220;Our missions are very dependent on time, so we must quickly synthesize a lot of information. This function can really indicate where our efforts should be focused,&#8221; says Monaco.</p>
<p>Semantic search is another maturing tool. It improves on the chat service&#8217;s current search engine, which returns empty results if chat messages do not contain every word in the query. With the new tool, users can ask questions in a natural language format, such as why a specific aircraft is delayed, and receive intelligent results. &#8220;It incorporates a search model based on neural networks that can understand the user&#8217;s intent and go beyond simple term matching,&#8221; says VanDam.</p>
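The production tool uses a neural search model; the core idea, ranking messages by meaning rather than by exact word overlap, can be sketched with a hand-built synonym table and cosine similarity over word counts. The synonym map below is a crude stand-in for what an embedding model learns, and all data is illustrative.

```python
import math
from collections import Counter

# Toy stand-in for learned semantics: normalize near-synonyms
# before comparing, so "plane late" can match "aircraft delayed".
SYNONYMS = {"late": "delayed", "holdup": "delayed", "plane": "aircraft"}

def vectorize(text):
    return Counter(SYNONYMS.get(w, w) for w in text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(query, messages):
    """Rank messages by similarity to the query's normalized vector."""
    q = vectorize(query)
    return sorted(messages, key=lambda m: cosine(q, vectorize(m)), reverse=True)

messages = [
    "aircraft 42 delayed by maintenance",
    "crew swap complete on mission 7",
]
best = search("why is the plane late", messages)[0]
```

A keyword search for "plane late" would miss the first message entirely; the normalized-vector ranking surfaces it first.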
<p>Other tools under development aim to automatically add users to chat conversations judged relevant to their expertise, to predict the amount of ground time needed to unload specific types of cargo from aircraft, and to summarize key processes from regulatory documents as a guide for operators as they develop mission plans.</p>
<p>The CAITT project grew out of the DAF-MIT AI Accelerator, a three-way effort among MIT, Lincoln Laboratory, and the Department of the Air Force (DAF) to develop and transition AI technology in support of both the DAF and society. &#8220;Through our involvement in the AI Accelerator via the NITMRE project, we realized we could do something innovative with all of the unstructured chat information in the 618th AOC,&#8221; says Heaton.</p>
<p>As laboratory researchers mature their prototype CAITT tools, they have begun transitioning them to the 402nd Software Engineering Group, a software provider for the U.S. Department of Defense. That group will implement the tools within the operational software environment used by the 618th AOC. </p>
</p></div>
]]></content:encoded>
					
		
		
		<media:content url="https://i0.wp.com/news.mit.edu/sites/default/files/images/202504/mit-lincoln-lab-usaf.jpg?ssl=1" medium="image"></media:content>
            	</item>
		<item>
		<title>Teaching the robot its limits so that it can safely perform open-ended tasks</title>
		<link>https://aisckool.com/teaching-the-robot-its-limits-so-that-it-can-safely-perform-open-ended-tasks/</link>
		
		<dc:creator><![CDATA[The AI Sckool]]></dc:creator>
		<pubDate>Fri, 13 Dec 2024 06:25:39 +0000</pubDate>
				<category><![CDATA[NLP]]></category>
		<guid isPermaLink="false">https://aisckool.com/?p=9054</guid>

					<description><![CDATA[If someone advises you to &#8220;know your limits,&#8221; they are probably suggesting that you exercise in moderation. However, for a robot, the motto represents learning constraints, that is, the limitations of a specific task in the machine&#8217;s environment, in order to perform the work safely and correctly. For example, imagine asking a robot to tidy up the kitchen [&#8230;]]]></description>
										<content:encoded><![CDATA[<p></p>
<div>
<p dir="ltr" id="docs-internal-guid-8a4c5d2f-7fff-0f2d-8a04-5aa86fc148ee">If someone advises you to &#8220;know your limits,&#8221; they are probably suggesting that you exercise in moderation. However, for a robot, the motto represents learning constraints, that is, the limitations of a specific task in a machine environment to perform the work safely and correctly.</p>
<p dir="ltr">For example, imagine asking a robot to tidy up a kitchen when it doesn&#8217;t understand the physics of its surroundings. How can a machine generate a practical, multi-step plan to keep a room spotless? Large language models (LLMs) can come close, but if a model is trained solely on text, it will likely miss key details about the robot&#8217;s physical limitations, such as how far the robot can reach or whether there are nearby obstacles to avoid. Stick to LLMs alone, and you&#8217;ll probably end up cleaning pasta stains off your floorboards.</p>
<p dir="ltr">To help robots perform such open-ended tasks, researchers at MIT&#8217;s Computer Science and Artificial Intelligence Laboratory (CSAIL) used computer vision models to see what is around the machine and model its limitations. The team&#8217;s strategy has an LLM sketch out a plan, which is then checked in a simulator to ensure it is safe and realistic. If the sequence of actions is infeasible, the language model generates a new plan, iterating until it reaches one the robot can execute.</p>
<p dir="ltr">This trial-and-error method, which the researchers call &#8220;Planning for Robots via Code for Continuous Constraint Satisfaction&#8221; (PRoC3S), tests long-horizon plans to make sure they satisfy all constraints, and it enables a robot to perform tasks as diverse as writing individual letters, drawing stars, and sorting and placing blocks in different positions. In the future, PRoC3S could help robots perform more complex tasks in dynamic environments such as homes, where they may be asked to carry out a general request consisting of multiple steps (e.g., &#8220;make me breakfast&#8221;).</p>
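The propose-check-replan loop described above can be sketched in a few lines of Python. Everything here is a stand-in: the &#8220;LLM&#8221; is a random proposer, the &#8220;simulator&#8221; checks a single made-up reach constraint, and all names and numbers are invented for illustration, not taken from the PRoC3S implementation.

```python
import random

def propose_plan(task, feedback=None):
    """Stand-in for the LLM planner: returns a candidate sequence of
    reach targets. A real system would prompt a language model with
    the task description and any constraint-violation feedback."""
    return [random.uniform(0.0, 2.0) for _ in range(3)]

def simulate(plan, reach_limit=1.0):
    """Stand-in for the physics simulator: checks each step against
    the robot's modeled constraints (here, just arm reach)."""
    violations = [x for x in plan if x > reach_limit]
    return len(violations) == 0, violations

def plan_until_feasible(task, max_tries=1000):
    """The trial-and-error outer loop: propose, check in simulation,
    and re-propose on failure until a feasible plan is found."""
    feedback = None
    for _ in range(max_tries):
        plan = propose_plan(task, feedback)
        ok, feedback = simulate(plan)
        if ok:
            return plan
    raise RuntimeError("no feasible plan found")

random.seed(0)  # deterministic for the demonstration
plan = plan_until_feasible("draw a star")
```

The key design point carried over from the article: plans are validated against a model of the robot's physical limits before anything runs on hardware.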
<p dir="ltr">“LLMs and classical robotics systems, such as task and motion planners, cannot perform these types of tasks on their own, but their synergy enables open-ended problem solving,” says graduate student Nishanth Kumar SM ’24, co-lead author of a new paper on PRoC3S. “We create an ongoing simulation of what is happening around the robot and try many possible action plans. Vision models help us build a highly realistic digital world that enables the robot to reason about feasible actions at every step of a long-horizon plan.”</p>
<p dir="ltr">The team&#8217;s work was presented last month in a paper presented at the Conference on Robotic Learning (CoRL) in Munich, Germany.</p>
</p></div>
<div>
<p>Video: &#8220;Teaching the robot its limits for open-ended tasks&#8221; (MIT CSAIL)</p>
</div>
<div>
<p dir="ltr">The researchers&#8217; method uses an LLM pretrained on text from the internet. Before asking PRoC3S to perform a task, the team provided its language model with a sample task (e.g., drawing a square) related to the target task (drawing a star). The sample task includes a description of the activity, a long-horizon plan, and relevant details about the robot&#8217;s environment.</p>
<p dir="ltr">But how did these plans work out in practice? In simulations, PRoC3S successfully drew stars and letters eight out of 10 times. It could also arrange digital blocks into pyramids and lines and precisely place items such as fruit on a plate. In each of these digital demonstrations, the CSAIL method performed the desired task more consistently than comparable approaches such as <a href="https://arxiv.org/pdf/2403.11552" target="_blank" rel="noopener">&#8220;LLM3&#8221;</a> and <a href="https://arxiv.org/pdf/2209.07753" target="_blank" rel="noopener">&#8220;Code as Policies&#8221;</a>.</p>
<p>CSAIL engineers then took the approach to the real world. Their method developed and executed plans on a robotic arm, teaching it to arrange blocks in straight lines. PRoC3S also enabled the machine to place blue and red blocks into matching bowls and to move all objects near the center of the table.</p>
<p dir="ltr">Kumar and co-lead author Aidan Curtis SM &#8217;23, also a graduate student at CSAIL, say these findings point to how LLMs can develop safer plans that people can trust to work in practice. The researchers envision a home robot that can be given a more general request (e.g., &#8220;bring me some chips&#8221;) and reliably determine the specific steps needed to fulfill it. PRoC3S could help such a robot test its plans in an analogous digital environment to find a working course of action and, more importantly, bring you a tasty snack.</p>
<p dir="ltr">In future work, the researchers aim to improve these results using a more advanced physics simulator and to extend to more complex, longer-horizon tasks via more scalable data-search techniques. Moreover, they plan to apply PRoC3S to mobile robots, such as quadrupeds, for tasks that involve walking and scanning the environment.</p>
<p dir="ltr">“Using foundation models like ChatGPT to control robot actions can lead to unsafe or incorrect behavior because of hallucinations,” says Eric Rosen, a researcher at The AI Institute who was not involved in the work. “PRoC3S tackles this issue by using foundation models for high-level task guidance, while applying AI techniques that explicitly reason about the world to ensure verifiably safe and correct actions. This combination of planning-based and data-driven approaches may be the key to developing robots capable of understanding and reliably performing a wider range of tasks than is currently possible.”</p>
<p dir="ltr">Kumar and Curtis&#8217; co-authors are also CSAIL affiliates: MIT undergraduate student Jing Cao and MIT Department of Electrical Engineering and Computer Science professors Leslie Pack Kaelbling and Tomás Lozano-Pérez. Their work was supported in part by the National Science Foundation, the Air Force Office of Scientific Research, the Office of Naval Research, the Army Research Office, the MIT Quest for Intelligence, and The AI Institute.</p>
</p></div>
]]></content:encoded>
					
		
		
		<media:content url="https://i0.wp.com/news.mit.edu/sites/default/files/images/202411/TestPRoC3S_0.jpg?ssl=1" medium="image"></media:content>
            	</item>
		<item>
		<title>Study: Viewing negative content online worsens mental health problems</title>
		<link>https://aisckool.com/study-viewing-negative-content-online-worsens-mental-health-problems/</link>
		
		<dc:creator><![CDATA[The AI Sckool]]></dc:creator>
		<pubDate>Fri, 06 Dec 2024 04:12:04 +0000</pubDate>
				<category><![CDATA[NLP]]></category>
		<guid isPermaLink="false">https://aisckool.com/?p=8745</guid>

					<description><![CDATA[According to a series of studies conducted by scientists from MIT, people struggling with mental problems are more likely to view negative content on the Internet, which in turn worsens their symptoms. The group responsible for the study developed web plugin tool to assist people who want to protect their mental health make more informed [&#8230;]]]></description>
										<content:encoded><![CDATA[<p>According to a series of studies conducted by scientists from MIT, people struggling with mental problems are more likely to view negative content on the Internet, which in turn worsens their symptoms.</p>
<p>The group responsible for the study developed <a href="https://affectivebrain.com/?page_id=7596" target="_blank" rel="noopener noreferrer">web plugin tool</a> to assist people who want to protect their mental health make more informed decisions about the content they watch.</p>
<p>The findings are presented in an open-access paper by <a href="https://bcs.mit.edu/directory/tali-sharot" target="_blank" rel="noopener">Tali Sharot</a>, a professor of cognitive neuroscience at MIT and at University College London, and Christopher A. Kelly, a former visiting graduate student who was a member of Sharot&#8217;s Affective Brain Lab when the research was conducted and is currently at Stanford University&#8217;s Institute for Human-Centered AI. The findings were <a href="https://www.nature.com/articles/s41562-024-02065-6" target="_blank" rel="noopener">published November 21 in the journal Nature Human Behaviour</a>.</p>
<p>“Our study shows a causal, bidirectional relationship between mental health and what you do online. We found that people who already have mental health symptoms are more likely to go online and to view information that is negative or fear-inducing,&#8221; says Sharot. “And after viewing this content, their symptoms worsen. It&#8217;s a feedback loop.”</p>
<p>The study analyzed the web browsing habits of over 1,000 participants, using natural language processing to calculate a negative and positive score for each website visited, as well as ratings of anger, fear, anticipation, trust, surprise, sadness, joy, and disgust. Participants also completed questionnaires to assess their mental health and indicated their mood immediately before and after their internet browsing sessions. The researchers found that participants had a better mood after viewing less negative websites, and participants with a worse mood before browsing tended to view more negative websites.</p>
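The article says each visited website was assigned negative and positive scores via natural language processing. A minimal sketch of one common approach, lexicon-based scoring, is below; the word lists are invented for illustration and are not the researchers' actual lexicon.

```python
# Toy lexicon-based valence scorer: score a page's text by the
# fraction of words that appear in small positive/negative word
# lists (illustrative, not the study's method in detail).
NEGATIVE = {"war", "fear", "crisis", "death", "anger"}
POSITIVE = {"joy", "hope", "success", "calm", "trust"}

def valence_score(text):
    """Return (positive_score, negative_score) as fractions of words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    if not words:
        return 0.0, 0.0
    pos = sum(w in POSITIVE for w in words) / len(words)
    neg = sum(w in NEGATIVE for w in words) / len(words)
    return pos, neg

pos, neg = valence_score("Fear and anger dominate coverage of the crisis")
```

Scores like these, aggregated over a browsing session, are the kind of per-site signal the study correlated with participants' mood and symptoms.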
<p>In another study, participants were asked to read information from two websites randomly selected from six negative or six neutral websites. They then indicated their mood levels both before and after viewing the pages. The analysis found that participants who were exposed to negative websites were in a worse mood than those who viewed neutral websites, and then, when asked to browse the Internet for 10 minutes, visited more negative websites.</p>
<p>“The results contribute to the ongoing debate about the relationship between mental health and online behavior,” the authors wrote. “Most research on this relationship has focused on quantity of use, such as duration of internet use or frequency of social media use, which has led to mixed conclusions. Here, however, we focus on the type of content viewed and find that its emotional properties are causally and bidirectionally related to mental health and mood.”</p>
<p>To test whether the intervention could change Internet browsing choices and improve mood, researchers provided participants with search results pages containing three search results for each of several queries. Some participants were assigned labels for each search result on a scale from &#8220;feel better&#8221; to &#8220;feel worse.&#8221; The remaining participants did not receive any labels. Those who received labels were less likely to choose negative content and more likely to choose positive content. Further research showed that those who watched more positive content reported significantly better mood.</p>
<p>Based on these findings, Sharot and Kelly created a downloadable <a href="https://affectivebrain.com/?page_id=7596" target="_blank" rel="noopener noreferrer">plug-in tool</a> called &#8220;Digital Diet,&#8221; which scores Google search results along three dimensions: emotion (whether people, on average, find the content positive or negative), knowledge (to what extent the information on a website helps one understand a topic, on average), and actionability (to what extent the information on a website is useful, on average). MIT electrical engineering and computer science graduate student Jonatan Fontanez &#8217;24, a former researcher in Sharot&#8217;s lab, also contributed to the development of the tool. The tool was released to the public this week, alongside the publication of the paper.</p>
<p>“People with poorer mental health tend to seek out more negative and fear-inducing content, which in turn exacerbates their symptoms and creates a vicious feedback loop,” Kelly says. “We hope this tool will help them gain more autonomy over what is on their mind and break negative cycles.”</p>
]]></content:encoded>
					
		
		
		<media:content url="https://i2.wp.com/news.mit.edu/sites/default/files/images/202411/social-media-study.jpg?ssl=1" medium="image"></media:content>
            	</item>
		<item>
		<title>Computer automatically reads old language</title>
		<link>https://aisckool.com/computer-automatically-reads-old-language/</link>
		
		<dc:creator><![CDATA[The AI Sckool]]></dc:creator>
		<pubDate>Tue, 02 Jul 2024 14:35:31 +0000</pubDate>
				<category><![CDATA[NLP]]></category>
		<guid isPermaLink="false">https://aisckool.com/?p=2852</guid>

					<description><![CDATA[In his 2002 book, Andrew Robinson, then literary editor of a London higher education supplement, argued that &#8220;successfully deciphering archaeological puzzles requires a synthesis of logic and intuition&#8230; which computers do not possess (and probably cannot possess).&#8221; Regina Barzilay, an assistant professor at the MIT Computer Science and Artificial Intelligence Lab, Ben Snyder, a graduate [&#8230;]]]></description>
										<content:encoded><![CDATA[<div>
<p>            In his 2002 book, Andrew Robinson, then literary editor of a London higher education supplement, argued that &#8220;successfully deciphering archaeological puzzles requires a synthesis of logic and intuition&#8230; which computers do not possess (and probably cannot possess).&#8221; </p>
<p>Regina Barzilay, an associate professor in the MIT Computer Science and Artificial Intelligence Lab, Ben Snyder, a graduate student in her lab, and Kevin Knight of the University of Southern California took that claim personally. At the annual meeting of the Association for Computational Linguistics in Sweden next month, they will <a href="http://people.csail.mit.edu/bsnyder/papers/bsnyder_acl2010.pdf" target="_blank" rel="noopener">present a paper</a> on a new computer system that, in a matter of hours, deciphered a large part of the ancient Semitic language Ugaritic. In addition to helping archaeologists decipher the eight or so ancient languages that have so far resisted their efforts, the work could also help expand the number of languages that automated translation systems like Google Translate can handle.</p>
<p>To replicate the “intuition” that Robinson says computers can’t grasp, the researchers’ software makes several assumptions. The first is that the language being deciphered is closely related to another language: in the case of Ugaritic, the researchers chose Hebrew. The second is that there’s a systematic way to map the alphabet from one language to the alphabet from another, and that correlated symbols will occur with similar frequency in the two languages. </p>
<p>The system makes a similar assumption at the word level: the languages should share at least some cognates, words with common roots, as French and Spanish do. Finally, the system makes a similar mapping for parts of words. For example, a word like “overloading” has both a prefix — “over” — and a suffix — “ing.” The system predicts that other words in the language will feature the prefix “over” or the suffix “ing,” or both, and that a cognate of “overloading” in another language — say, “surchargeant” in French — will have a similar three-part structure.</p>
<p><strong>Crosstalk</strong></p>
<p>The system infers these different levels of correspondence on its own. For example, it might start with several competing hypotheses about alphabetic mappings based entirely on symbol frequency—mapping symbols that occur frequently in one language to those that occur frequently in another. Using a kind of probabilistic modeling common in AI research, the system would then figure out which of these mappings seems to identify a set of consistent suffixes and prefixes. From that, it could look for word-level correspondences, which in turn could help it refine its alphabetic mappings. “We iterate through the data hundreds, thousands of times,” Snyder says, “and each time, our guesses get more likely because we’re actually getting closer to a solution where we get more consistency.” Eventually, the system reaches a point where changing the mappings no longer improves consistency.</p>
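The first step described here, hypothesizing an alphabet mapping purely from symbol frequency, can be demonstrated on a toy substitution cipher, where frequency ranks line up exactly. The real system starts from such hypotheses and then refines them with probabilistic models of affixes and cognates; this sketch stops at the first step.

```python
from collections import Counter

def frequency_ranking(text):
    """Symbols ordered from most to least frequent (ties broken by
    first appearance, via Counter's stable most_common ordering)."""
    return [s for s, _ in Counter(text).most_common()]

def rank_mapping(unknown_text, known_text):
    """First-pass hypothesis: align the i-th most frequent symbol of
    the unknown script with the i-th most frequent symbol of the
    known language."""
    return dict(zip(frequency_ranking(unknown_text),
                    frequency_ranking(known_text)))

# Toy check: encrypt a text with a shift cipher, then see whether
# frequency ranks alone recover the original mapping.
plain = "the quick brown fox jumps over the lazy dog the end"
cipher = {c: chr((ord(c) - 97 + 7) % 26 + 97) if c.islower() else c
          for c in set(plain)}
encrypted = "".join(cipher[c] for c in plain)
guess = rank_mapping(encrypted, plain)
decoded = "".join(guess[c] for c in encrypted)
```

On real texts in two different languages the frequency ranks only roughly agree, which is why the system maintains several competing hypotheses and iterates rather than trusting a single rank alignment.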
<p>Ugaritic has already been deciphered: otherwise, the researchers would have no way of assessing their system&#8217;s performance. The Ugaritic alphabet has 30 letters, and the system correctly mapped 29 of them to their Hebrew equivalents. About a third of the words in Ugaritic have Hebrew equivalents, and of those, the system correctly identified 60 percent. &#8220;Of the ones that are wrong, they&#8217;re often only wrong by one letter, so they&#8217;re often very good guesses,&#8221; Snyder says. </p>
<p>In addition, he points out, the system currently does not use any contextual information to resolve ambiguities. For example, the Ugaritic words for &#8220;house&#8221; and &#8220;daughter&#8221; are spelled the same way, but their Hebrew equivalents are not. While the system might occasionally confuse them, a human decipherer would easily recognize from context which was intended.</p>
<p><strong>Bubble</strong></p>
<p>Still, Andrew Robinson remains skeptical. “If the authors believe that their approach will ultimately lead to computerized ‘automatic’ reading of currently undeciphered scripts,” he writes in an email, “then I’m afraid I’m not convinced by their work at all.” The researchers’ approach, he says, assumes that the language to be deciphered has an alphabet that can be mapped onto that of a known language—“which is almost certainly not the case for any of the other important undeciphered scripts,” Robinson writes. It also assumes, he says, that it’s clear where one character or word ends and another begins, which is not true for many deciphered and undeciphered scripts.</p>
<p>“Each language has its own challenges,” Barzilay agrees. “Successful decipherment would most likely require a method tailored to the language.” But, as she points out, deciphering Ugaritic took years and relied on a few lucky breaks—such as the discovery of an axe with the word for “axe” written on it in Ugaritic. “The output from our system would shorten that process by orders of magnitude,” she says.</p>
<p>Indeed, Snyder and Barzilay don’t think a system like the one they designed with Knight would ever replace human decipherers. “But it’s a powerful tool that can aid the human decipherment process,” Barzilay says. What’s more, a variation of it could also help extend the versatility of translation software. Many online translators rely on parallel text analysis to determine word correspondences: They might, for example, look through the collected works of Voltaire, Balzac, Proust, and many other writers, in both English and French, looking for consistent mappings between words. “That’s how statistical translation systems have worked for the last 25 years,” Knight says. </p>
<p>But not all languages have such comprehensively translated literature: Snyder points out that Google Translate currently works in only 57 languages. The techniques used in the decipherment system could be adapted to create lexicons for thousands of other languages. “The technology is very similar,” says Knight, who works in machine translation. “They feed off each other.”</p>
</div>
]]></content:encoded>
					
		
		
		<media:content url="https://i0.wp.com/news.mit.edu/sites/default/files/images/201006/20100628141035-0.jpg?ssl=1" medium="image"></media:content>
            	</item>
		<item>
		<title>Healthcare of the future</title>
		<link>https://aisckool.com/healthcare-of-the-future/</link>
		
		<dc:creator><![CDATA[The AI Sckool]]></dc:creator>
		<pubDate>Tue, 02 Jul 2024 03:28:39 +0000</pubDate>
				<category><![CDATA[NLP]]></category>
		<guid isPermaLink="false">https://aisckool.com/?p=2816</guid>

					<description><![CDATA[In 1974, Peter Szolovits predicted that by the 1980s, most large hospitals would adopt electronic health records. Although the technology has not developed as quickly as expected, the U.S. government is now making a major effort to move hospitals from paper files filled with notes to a secure and efficient electronic system for collecting, storing, and [&#8230;]]]></description>
										<content:encoded><![CDATA[<div>
<p>            In 1974, Peter Szolovits predicted that by the 1980s, most large hospitals would adopt electronic health records. Although the technology has not developed as quickly as expected, the U.S. government is now making a major effort to move hospitals from paper files filled with notes to a secure and efficient electronic system for collecting, storing, and retrieving medical records. Today, Szolovits—a professor of computer science and engineering, health sciences, and technology and leader of the Clinical Decision Making Group at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL)—is at the forefront of a movement to make health information technology more effective and useful to both providers and patients. </p>
<p>Szolovits’ current focus is on using natural language processing to better understand medical records, which often contain unstructured narrative notes and abbreviations. As part of the government’s push for electronic health records, Szolovits is a participant in one of four Strategic Health IT Advanced Research Projects (SHARP) from the Office of the National Coordinator for Health IT. The project, led by colleagues at the Mayo Clinic, focuses on the secondary use of electronic health records, providing a way to analyze clinical data sets for research, quality assessment, and public health support. The main technical challenges include developing natural language processing methods that enable computer programs to automatically extract clinical entities, events, and relationships from narrative text in the records, and to combine the extracted data with existing information from laboratory tests and physician orders to identify patient conditions and current treatments. </p>
<p>“My interest in natural language processing was rekindled about 12 years ago when I noticed that a lot of important medical data was actually locked up in these narrative notes, and we needed to have some way to dig them out so we could use them,” Szolovits says. </p>
<p>William Long, a principal research scientist at the Clinical Decision Making Group who has worked with Szolovits for more than 30 years, has developed several programs that scan electronic nursing reports and intensive care unit (ICU) discharge summaries, looking for key words and phrases that provide clues about a patient’s condition. Drawing on information gleaned from the Unified Medical Language System, a compilation of more than 150 medical dictionaries, the Clinical Decision Making Group has programmed the system to identify a comprehensive list of terms and key concepts. The program can then provide physicians with a brief summary of the compiled information. </p>
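Dictionary-driven term spotting of the kind described here can be sketched in a few lines. The term list below is invented for illustration and is not drawn from the actual Unified Medical Language System; a real system would handle millions of terms, spelling variants, and negation.

```python
# Toy dictionary-based concept spotter in the spirit of the
# record-scanning programs described above.
TERMS = {
    "atrial fibrillation": "cardiac arrhythmia",
    "shortness of breath": "dyspnea",
    "bp": "blood pressure",
}

def spot_concepts(note):
    """Return (matched phrase, normalized concept) pairs found in a
    free-text note, checking longer phrases first."""
    text = note.lower()
    hits = []
    for phrase in sorted(TERMS, key=len, reverse=True):
        if phrase in text:
            hits.append((phrase, TERMS[phrase]))
    return hits

note = "Pt reports shortness of breath; BP stable overnight."
concepts = spot_concepts(note)
```

Output like this, aggregated over a whole record, is what lets the program hand physicians a brief summary of the compiled information.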
<p>Until now, the technology has been used more for clinical trials than for diagnostic purposes. For example, it was used to gather a group of patients who had the same disease but responded differently to identical treatments. With systems that can organize and analyze medical records, doctors can discover whether genetics, external medications, or personal habits have affected a patient, and learn more about which treatments are most effective. </p>
<p>Szolovits still believes in using AI in the diagnostic process, but in a different way than he originally imagined. He and his colleagues—Roger Mark; George Verghese; and colleagues from CSAIL, the Laboratory for Information and Decision Systems, Harvard-MIT Health Sciences and Technology, and Beth Israel Deaconess Medical Center—collected data on about 35,000 ICU admissions at a large Boston hospital. CSAIL graduate student Caleb Hug used the data to create predictive models that, for each significant change in a patient’s condition, estimate how well the patient is likely to fare in the future. Such acuity models can warn doctors of danger and are also useful in determining the resources needed to aid a particular patient. </p>
<p>The same methodology can also be used for more detailed predictions. Hug’s research used the technique to predict when it would be safe to wean patients off life-support devices such as ventilators and intra-aortic pumps, as well as whether a patient was likely to develop septic shock, hypotension, or kidney failure. </p>
<p>Outside the ICU, the group is tackling the challenge of recording doctor-patient interactions and translating those conversations into actionable information. For a project at Children&#8217;s Hospital Boston, group members recorded about 100 encounters at the Pediatric Environmental Health Center, where doctors often see cases of lead poisoning in children. Each doctor-patient interaction is recorded, translated into English text using a speech-understanding program, and then analyzed for key terms and phrases using a natural-language processing program developed by assistant professor of computer science and engineering Regina Barzilay and her group. </p>
<p>Although the project has proven challenging due to the difficulty of building a high-fidelity speech-understanding component, Szolovits believes it could improve medical visits for doctors. In the future, Szolovits says, the technology could also be used by hospital nurses so they can focus more on patient care than taking notes. </p>
<p>Despite the challenges of moving to more technologically advanced systems, Szolovits and Long believe that natural language processing programs like those developed by the Clinical Decision Making Group could significantly improve the quality of medical care. </p>
<p>“I think it’s going to revolutionize healthcare. It’s going to be much easier to get all of this information in a form that we can actually do something with and process it in a way that benefits everyone,” Long says. “Doctors will benefit from being able to look at years of patient care and turn it into research about what’s working, what’s not working, how to improve care, how to treat patients better.” </p>
<p>For more information about the Clinical Decision Making Group, please visit: <a href="http://groups.csail.mit.edu/medg/" target="_blank" rel="noopener">http://groups.csail.mit.edu/medg/</a>.</p>
</div>
]]></content:encoded>
					
		
		
		<media:content url="https://i1.wp.com/news.mit.edu/sites/default/files/images/201107/20110705105313-1.jpg?ssl=1" medium="image"></media:content>
            	</item>
		<item>
		<title>The Advantage of Ambiguity</title>
		<link>https://aisckool.com/the-advantage-of-ambiguity/</link>
		
		<dc:creator><![CDATA[The AI Sckool]]></dc:creator>
		<pubDate>Mon, 01 Jul 2024 16:25:06 +0000</pubDate>
				<category><![CDATA[NLP]]></category>
		<guid isPermaLink="false">https://aisckool.com/?p=2784</guid>

					<description><![CDATA[Why did language evolve? While the answer may seem obvious &#8211; as a way for individuals to exchange information &#8211; linguists and other communication researchers have been debating this question for years. Many prominent linguists, including MIT&#8217;s Noam Chomsky, argue that language is actually poorly designed for communication. They argue that such use is merely [&#8230;]]]></description>
										<content:encoded><![CDATA[<div>
<p>            Why did language evolve?  While the answer may seem obvious &#8211; as a way for individuals to exchange information &#8211; linguists and other communication researchers have been debating this question for years.  Many prominent linguists, including MIT&#8217;s Noam Chomsky, argue that language is actually poorly designed for communication.  They argue that such use is merely a byproduct of a system that probably evolved for other reasons &#8211; perhaps to structure our private thoughts.</p>
<p>As evidence, these linguists point to the existence of ambiguity: They argue that in a system optimized for transmitting information between speaker and listener, each word would have only one meaning, eliminating any chance of confusion or misunderstanding. Now a group of cognitive scientists at MIT has turned that idea on its head. In a new theory, they argue that ambiguity actually makes language more efficient, by allowing the reuse of short, efficient sounds that listeners can easily disambiguate through context.</p>
<p>&#8220;Various people have argued that ambiguity is a communication problem,&#8221; says Ted Gibson, a professor of cognitive science at MIT and senior author of a paper describing the research, forthcoming in the journal Cognition. &#8220;But the fact that context disambiguates has important implications for the reuse of potentially ambiguous forms. Ambiguity is no longer a problem &#8211; you can take advantage of it, because you can reuse easy [words] in different contexts, over and over.&#8221;</p>
<p>The paper&#8217;s lead author is Steven Piantadosi PhD &#8217;11; co-author Harry Tily is a postdoc in MIT&#8217;s Department of Brain and Cognitive Sciences.</p>
<p><strong>&#8216;What do you mean&#8217;?</strong></p>
<p>As a somewhat ironic example of ambiguity, consider the word &#8220;mean.&#8221; It can mean to indicate or signify, but it can also refer to an intention or purpose (&#8220;I meant to go to the store&#8221;); something offensive or nasty; or the mathematical average of a set of numbers. Adding an &#8220;s&#8221; introduces even more potential definitions: an instrument or method (&#8220;a means to an end&#8221;) or financial resources (&#8220;to live within one&#8217;s means&#8221;).</p>
<p>Yet virtually no English speaker is confused when they hear the word &#8220;mean.&#8221; This is because the different meanings of the word occur in such different contexts that listeners can almost automatically infer its meaning. </p>
<p>Given the power of context to disambiguate, the researchers hypothesized that languages might exploit ambiguity to reuse words &#8211; presumably the words that are easiest for language-processing systems to handle. Based on observation and previous research, they predicted that words with fewer syllables, higher frequency, and simpler pronunciation should have the most meanings.</p>
<p>To test this prediction, Piantadosi, Tily, and Gibson conducted corpus studies of English, Dutch, and German. (In linguistics, a corpus is a large collection of samples of language in natural use that can be used to find word frequencies or patterns.) By comparing certain properties of words with their number of meanings, the researchers confirmed their suspicion that shorter, more frequent words, as well as those that follow the typical sound patterns of the language, were the most likely to be ambiguous &#8211; and these trends were statistically significant in all three languages.</p>
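<p>The kind of corpus analysis described above can be illustrated with a small sketch. Everything in it is hypothetical: the words and their sense counts below are invented for illustration, not taken from the researchers&#8217; corpora. The sketch simply checks that word length and number of senses are negatively rank-correlated in the toy lexicon, mirroring the reported trend that shorter words carry more meanings.</p>

```python
# Toy lexicon: word -> number of dictionary senses (all counts invented
# for illustration; real studies use large corpora and real dictionaries).
lexicon = {
    "mean": 5, "run": 7, "set": 8, "bank": 4, "to": 6, "of": 5,
    "photosynthesis": 1, "ambiguity": 2, "disambiguation": 1, "serendipity": 1,
}

def spearman(xs, ys):
    """Spearman rank correlation via Pearson on ranks (ties handled crudely)."""
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0.0] * len(vals)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

lengths = [len(w) for w in lexicon]
senses = [lexicon[w] for w in lexicon]
rho = spearman(lengths, senses)
print(round(rho, 2))  # negative: shorter words have more senses in this toy lexicon
```

A serious replication would compute this over corpus word frequencies and dictionary sense counts rather than a hand-made list, but the direction of the correlation is the point.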
<p>To understand why ambiguity makes language more efficient, not less, think about the conflicting desires of a speaker and a listener. The speaker is interested in conveying as much as possible using as few words as possible, while the listener is trying to gain a full and detailed understanding of what the speaker is trying to say. But, the researchers write, it is &#8220;cognitively cheaper&#8221; for the listener to infer things from context than for the speaker to spend time on longer, more complicated utterances. The result is a system that leans toward ambiguity, reusing the &#8220;easiest&#8221; words. When context is taken into account, it becomes clear that &#8220;ambiguity is something you actually want in a communication system,&#8221; Piantadosi says.</p>
<p>Tom Wasow, professor of linguistics and philosophy at Stanford University, calls the article “important and insightful.”</p>
<p>&#8220;You would expect that since languages are constantly changing, they would evolve to get rid of ambiguity,&#8221; Wasow says. &#8220;But if we look at natural languages, they are enormously ambiguous: words have many meanings, there are many ways to parse strings of words. &#8230; This article makes a really rigorous argument for why this kind of ambiguity is actually functional for communication purposes rather than dysfunctional.&#8221;</p>
<p><strong>Implications for computing</strong></p>
<p>The researchers say the statistical nature of their paper reflects a trend in linguistics that is increasingly relying on information theory and quantitative methods.</p>
<p>“The impact of computer science on linguistics is very large right now,” says Gibson, adding that natural language processing (NLP) is the main focus of those working at the intersection of the two fields.</p>
<p>Piantadosi points out that the ambiguity of natural language poses enormous challenges for NLP programmers. &#8220;Ambiguity is good for us [as humans] because we have these really sophisticated cognitive mechanisms for disambiguating,&#8221; he says. &#8220;It&#8217;s really hard to figure out the details of what they are, or even some approximation that a computer could use.&#8221;</p>
<p>But, Gibson says, computer scientists have long been aware of the problem. The new study provides a better theoretical and evolutionary explanation for why the ambiguity exists, but the message is the same: &#8220;Basically, if you have any human language in your input or output, you&#8217;re stuck needing context to disambiguate,&#8221; he says.</p>
</div>
]]></content:encoded>
					
		
		
		<media:content url="https://i3.wp.com/news.mit.edu/sites/default/files/images/201201/20120118175940-1.jpg?ssl=1" medium="image"></media:content>
            	</item>
		<item>
		<title>Mining doctors&#8217; notes for medical insights</title>
		<link>https://aisckool.com/mining-doctors-notes-regarding-medical-observations/</link>
		
		<dc:creator><![CDATA[The AI Sckool]]></dc:creator>
		<pubDate>Mon, 01 Jul 2024 05:16:53 +0000</pubDate>
				<category><![CDATA[NLP]]></category>
		<guid isPermaLink="false">https://aisckool.com/?p=2756</guid>

					<description><![CDATA[Over the last 10 years, it has become much more common for physicians to keep records electronically. Such documentation can contain a wealth of medically useful data: hidden correlations between symptoms, treatments, and outcomes, for example, or clues that patients are promising candidates for new drug trials. However, most of this data is hidden in [&#8230;]]]></description>
										<content:encoded><![CDATA[<div>
<p>Over the last 10 years, it has become much more common for physicians to keep records electronically. Such documentation can contain a wealth of medically useful data: hidden correlations between symptoms, treatments, and outcomes, for example, or clues that patients are promising candidates for new drug trials.</p>
<p>However, most of this data is hidden in doctors&#8217; free-form notes. One of the difficulties in extracting data from unstructured text is what computer scientists call word sense disambiguation. For example, in a doctor&#8217;s notes, the word &#8220;discharge&#8221; may refer to a bodily discharge, but it may also refer to a discharge from the hospital. The ability to infer the intended meaning of words makes it much easier for computers to find useful patterns in mountains of data.</p>
<p>Next week at the American Medical Informatics Association&#8217;s (AMIA) annual symposium, researchers from MIT&#8217;s Computer Science and Artificial Intelligence Laboratory will present a new system for disambiguating the meanings of words used in doctors&#8217; clinical notes. The system is on average 75 percent accurate in disambiguating words with two senses, a marked improvement over previous methods. But more importantly, says Anna Rumshisky, an MIT postdoc who helped lead the new research, it represents a fundamentally new approach to word disambiguation that could lead to much more accurate systems while drastically reducing the amount of human effort needed to develop them.</p>
<p>Indeed, Rumshisky says, the paper originally accepted at the AMIA symposium described a system that used a more conventional approach to word disambiguation, with an average accuracy of only about 63 percent. &#8220;In our opinion, it wasn&#8217;t enough to make it actually usable,&#8221; says Rumshisky. &#8220;So instead, we tried something that had been tried before in the general domain, but never in the biomedical or clinical domain.&#8221;</p>
<p><strong>Topical use</strong></p>
<p>Specifically, Rumshisky explains, she and her coauthors &#8211; graduate student Rachel Chasin, whose master&#8217;s thesis is the basis for the new work; Peter Szolovits, a professor of computer science and engineering and of health sciences and technology at MIT; and researcher Özlem Uzuner, who earned her Ph.D. at MIT and is now an assistant professor at the University at Albany &#8211; adapted algorithms from a research area known as topic modeling. Topic modeling aims to automatically identify the topics of documents by inferring relationships between salient words.</p>
<p>&#8220;The twist we&#8217;re bringing over from the general domain is treating instances of the target word as documents and the senses as latent topics that we&#8217;re trying to infer,&#8221; Rumshisky says.</p>
<p>While a standard topic-modeling algorithm searches huge corpora of text to identify clusters of words that tend to appear in close proximity to one another, Rumshisky and her colleagues&#8217; algorithm identifies correlations not only between words, but also between words and other textual &#8220;features&#8221; &#8211; such as the syntactic roles of words. For example, if the word &#8220;discharge&#8221; is preceded by an adjective, it is much more likely to refer to a bodily discharge than to an administrative event.</p>
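<p>The adjective cue described above can be sketched as a simple feature-based disambiguator. This is a toy illustration, not the researchers&#8217; system: the note snippets, the adjective list, the sense labels, and the count-based voting are all invented for illustration, whereas the actual system infers senses as latent topics.</p>

```python
from collections import Counter

# Hypothetical labeled snippets: context around "discharge" plus a sense label.
TRAIN = [
    ("purulent discharge from the wound", "bodily"),
    ("greenish discharge noted on exam", "bodily"),
    ("discharge home with instructions", "administrative"),
    ("plan for discharge tomorrow morning", "administrative"),
]

ADJECTIVES = {"purulent", "greenish", "yellow", "clear"}  # toy adjective list

def features(snippet):
    """Extract simple context features for the target word 'discharge'."""
    words = snippet.split()
    i = words.index("discharge")
    feats = set(w for j, w in enumerate(words) if j != i)  # bag of context words
    if i > 0 and words[i - 1] in ADJECTIVES:
        feats.add("<prev_is_adjective>")  # e.g. "purulent discharge" -> bodily
    return feats

# Count how often each feature co-occurs with each sense in the training data.
counts = {}
for snippet, sense in TRAIN:
    for f in features(snippet):
        counts.setdefault(f, Counter())[sense] += 1

def disambiguate(snippet):
    """Each known feature votes for the sense it co-occurred with most often."""
    votes = Counter()
    for f in features(snippet):
        if f in counts:
            votes[counts[f].most_common(1)[0][0]] += 1
    return votes.most_common(1)[0][0] if votes else None

print(disambiguate("yellow discharge from the ear"))     # bodily
print(disambiguate("discharge home after observation"))  # administrative
```

The first test snippet is classified as &#8220;bodily&#8221; largely because the preceding adjective fires the <code>&lt;prev_is_adjective&gt;</code> feature, exactly the kind of syntactic cue the article describes.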
<p>Typically, topic-modeling algorithms assign different weights to different topics: for example, a single news article might be 50 percent about politics, 30 percent about the economy, and 20 percent about foreign affairs. Similarly, the MIT researchers&#8217; new algorithm assigns different weights to the different possible senses of ambiguous words.</p>
<p>One advantage of topic-modeling algorithms is that they are unsupervised: they can be deployed on large text collections without human supervision. As a result, the researchers can continue improving their algorithm to include more features, then let it loose on unannotated medical records to draw its own inferences. The more features it includes, the more accurate it should be, Rumshisky says.</p>
<p><strong>Promising features</strong></p>
<p>Among the features the researchers plan to incorporate into the algorithm are listings in the massive dictionary of medical terms developed by the National Institutes of Health, called the Unified Medical Language System (UMLS). Indeed, word associations in UMLS were the basis for the researchers&#8217; original algorithm &#8211; the one that achieved 63 percent accuracy. The problem there was that the length and structure of the paths from one word to another in UMLS did not always correspond to the semantic differences between the words. But the new system identifies only those correspondences that occur frequently enough to be likely to be useful.</p>
<p>&#8220;The parts of [UMLS] that are important for sense disambiguation would basically float to the top,&#8221; says Rumshisky. &#8220;It kind of gives you that association, if it&#8217;s valid, for free. If it&#8217;s not important, it just doesn&#8217;t matter.&#8221;</p>
<p>The researchers are also experimenting with additional syntactic and semantic features that may help with word disambiguation, as well as with word associations established under the NIH&#8217;s Medical Subject Headings paper-classification scheme. &#8220;It&#8217;s still not perfect because we haven&#8217;t integrated all the language features we wanted,&#8221; says Rumshisky. &#8220;But I have a feeling this is the right way.&#8221;</p>
<p>“About 80 percent of clinical information is hidden in clinical notes,” says Hongfang Liu, assistant professor of health informatics at Mayo Clinic.  “Many words or phrases are ambiguous there.  So to get the correct interpretation, you have to go through a word disambiguation phase.”</p>
<p>Liu says that although some computational linguists have applied topic modeling algorithms to the problem of word disambiguation, “I feel like they’re working on toy problems. I think in this case it could actually be applied to production-scale natural language processing systems.”</p>
</div>
]]></content:encoded>
					
		
		
		<media:content url="https://i3.wp.com/news.mit.edu/sites/default/files/images/201210/20121030154725-0.jpg?ssl=1" medium="image"></media:content>
            	</item>
	</channel>
</rss>
