Wednesday, December 25, 2024

Using language to understand machines


Natural language conveys ideas, actions, information, and intent through context and syntax, and vast volumes of it are stored in digitized text. This makes it an excellent source of data for training machine learning systems. Two master of engineering students in MIT’s 6-A MEng Thesis Program, Irene Terpstra ’23 and Rujul Gandhi ’22, are working with mentors in the MIT-IBM Watson AI Lab to harness the power of natural language to build artificial intelligence systems.

As computing becomes more advanced, researchers strive to improve the hardware it runs on; this means designing new computer chips. Drawing on the existing literature about which modifications achieve specific parameters and performance, Terpstra and her mentors and advisors, Anantha Chandrakasan, dean of the MIT School of Engineering and the Vannevar Bush Professor of Electrical Engineering and Computer Science, and IBM researcher Xin Zhang, are developing an artificial intelligence algorithm to assist in chip design.

“I’m creating a workflow to systematically analyze how these language models can help in the circuit design process. What reasoning capabilities do they have and how can they be integrated into the chip design process?” says Terpstra. “And on the other hand, if it proves useful enough, [we’ll] see if they can automatically design the chips themselves by attaching them to a reinforcement learning algorithm.”

To do this, Terpstra’s team is building an AI system that can iterate on the design process. This means experimenting with various pre-trained large language models (such as ChatGPT, Llama 2, and Bard), using an open-source circuit simulator language called NGspice, which encodes the chip’s parameters in code, and a reinforcement learning algorithm. With text prompts, researchers will be able to ask the language model how the chip’s physical design should be modified to achieve a specific goal, and receive guidance for adjustments. That output is then fed to a reinforcement learning algorithm, which updates the circuit design and generates new physical parameters for the chip.
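The feedback loop described above can be sketched roughly as follows. This is only an illustration of the structure, not the team’s actual system: the language model call and the NGspice simulation are replaced with hypothetical stand-in functions, and the "reinforcement learning" step is reduced to simple hill climbing.

```python
# Hypothetical stand-ins: in the actual work the suggestion would come from a
# large language model and the metric from an NGspice circuit simulation.
def llm_suggest_adjustment(params, goal):
    """Stub for a prompt like 'How should the chip be modified to increase
    gain?' Returns a direction (+1 or -1) for each design parameter."""
    return {name: +1 if goal == "increase_gain" else -1 for name in params}

def simulate_gain(params):
    """Stub for a simulator run; here, gain simply grows with transistor
    width and shrinks with load resistance."""
    return 2.0 * params["width_um"] - 0.5 * params["load_kohm"]

def design_step(params, goal, lr=0.1):
    """One loop iteration: nudge parameters in the LLM-suggested direction
    and keep the change only if the simulated metric improves."""
    direction = llm_suggest_adjustment(params, goal)
    candidate = {k: v + lr * direction[k] for k, v in params.items()}
    if simulate_gain(candidate) > simulate_gain(params):
        return candidate
    return params

params = {"width_um": 1.0, "load_kohm": 2.0}
for _ in range(10):
    params = design_step(params, "increase_gain")
print(round(params["width_um"], 2))
```

The point of the sketch is the division of labor: the language model proposes *how* to change the design, while the optimization loop decides *whether* to accept the change based on simulated performance.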

“The ultimate goal would be to combine the reasoning ability and knowledge base of these large language models with the optimization power of reinforcement learning algorithms, and have that design the chip itself,” Terpstra says.

Rujul Gandhi works with raw language alone. As an MIT student, Gandhi explored linguistics and computer science, combining them in her MEng work. “I was interested in communication, both between people and between people and computers,” says Gandhi.

Robots and other interactive AI systems are one area where communication must be understood by both humans and machines. Researchers often write instructions for robots using formal logic, which helps ensure that commands are executed safely and as intended; but formal logic can be hard for users to grasp, whereas natural language comes easily. To enable this seamless communication, Gandhi and her advisors, Yang Zhang of IBM and MIT assistant professor Chuchu Fan, are building a parser that converts natural language instructions into a machine-friendly form. Leveraging the linguistic structure encoded by a pre-trained T5 encoder-decoder model and a dataset of basic English commands annotated for specific tasks, Gandhi’s system identifies the smallest logical units, or atomic propositions, present in a given instruction.

“Once an instruction is given, the model identifies all the smaller subtasks it needs to perform,” says Gandhi. “Then, using a large language model, each subtask can be compared against the available actions and objects in the robot’s world. If any subtask cannot be carried out because a specific object is not recognized, or the action is not possible, the system can stop at that point and ask the user for help.”

This approach of dividing instructions into subtasks also allows her system to understand logical relationships expressed in English, such as “do task X until event Y happens.” Gandhi uses a dataset of step-by-step instructions across robot task domains such as navigation and manipulation, with a focus on household tasks. Using data written the way people would actually talk to each other has many advantages, she says, because it means users can formulate their instructions more flexibly.
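As a rough illustration of that kind of relationship, an “until” instruction can be split into its two atomic propositions and tagged with the temporal “until” operator (conventionally written U in temporal logic). This toy parser is hypothetical; the system described above uses a pre-trained T5 model, not pattern matching.

```python
import re

def parse_until(instruction):
    """Split a 'do X until Y' command into its atomic propositions and
    express it with the temporal 'until' operator, written 'U'."""
    match = re.fullmatch(r"(.+?) until (.+)", instruction.strip())
    if match is None:
        return None  # no 'until' relationship found in this instruction
    task, event = match.group(1), match.group(2)
    return {"operator": "U", "task": task, "event": event}

formula = parse_until("move forward until the door opens")
print(formula)
# {'operator': 'U', 'task': 'move forward', 'event': 'the door opens'}
```

Each extracted piece (“move forward,” “the door opens”) is an atomic proposition that can then be checked against the robot’s available actions and perceived objects.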

Gandhi’s next project involves developing speech models. In the context of speech recognition, some languages are considered “low resource,” since they may not have much transcribed speech available, or may have no written form at all. “One of the reasons I applied for an internship at the MIT-IBM Watson AI Lab was because I was interested in language processing for low-resource languages,” she says. “Many modern language models rely heavily on data, and when it’s not that easy to get all the data, you need to use the limited data you have efficiently.”

Speech is simply a stream of sound waves, but people having a conversation can easily figure out where words and thoughts begin and end. In speech processing, both humans and language models draw on an existing vocabulary to recognize word boundaries and understand meaning. In languages with few or no resources, a written vocabulary may not exist at all, so researchers cannot supply one to the model. Instead, the model can note which sound sequences occur together more often than others and infer that those may be individual words or concepts. In Gandhi’s research group, these inferred words are then collected into a pseudo-dictionary that serves as a labeling method for the low-resource language, creating labeled data for further applications.
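The core idea can be sketched in miniature: sound units that co-occur unusually often are merged into candidate pseudo-words. This is a hypothetical toy, not the group’s method; real systems operate on acoustic units rather than written syllables and use far more robust statistics than a raw count threshold.

```python
from collections import Counter

def build_pseudo_dictionary(utterances, min_count=3):
    """Count adjacent-unit pairs across utterances and treat pairs that
    recur at least min_count times as candidate pseudo-words."""
    pair_counts = Counter()
    for units in utterances:
        pair_counts.update(zip(units, units[1:]))
    return {"".join(pair) for pair, n in pair_counts.items() if n >= min_count}

# Toy 'transcripts' as sequences of sound units (syllables here for clarity)
data = [
    ["wa", "ter", "please"],
    ["more", "wa", "ter"],
    ["wa", "ter", "now"],
]
print(build_pseudo_dictionary(data))
# {'water'} -- ('wa', 'ter') is the only pair appearing 3 times
```

The resulting pseudo-dictionary plays the role of the missing written vocabulary: it gives the model labels to attach to recurring stretches of sound, which can then bootstrap further processing.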

Applications for language technologies are “almost everywhere,” says Gandhi. “You can imagine people being able to interact with software and devices in their native language, their native dialect. You can imagine improving all the voice assistants we use. You can imagine it being used for interpretation or written translation.”
