I'm a second-year PhD student in the InfoLab at MIT working with Boris Katz. I'm interested in multi-modal models for intelligence, using methods from computer vision and natural language processing (NLP). At present, I'm working on planning algorithms that use NLP grounded in perception. My work aims to tackle both classical planning problems and novel robotic and visually grounded planning problems. I use computer vision techniques and NLP to solve real-world planning problems. In general, the idea is to learn the plans of agents by observation and then, when faced with a new scene, to generate the motion of an agent along with a sentence describing its plan. In essence, we use English as the planning language, which lets us take advantage of its abstraction and ambiguity, and we learn to plan efficiently by watching other agents. This can be applied in visually grounded domains, such as robotic interaction, to understand what other agents will do in the future. In the long term, I intend to apply this work to more classical domains, for example planning over a database of facts, by grounding natural language in programs over those facts.