In-Context Operator Networks (ICON) for Large Scientific Learning Models
Can we build a single large model for a wide range of scientific problems?
We proposed a new paradigm for scientific machine learning, namely “In-Context Operator Networks” (ICON). A distinguishing feature of ICON is its ability to learn operators from data prompts during the inference phase, without any weight updates. A single ICON model can tackle a wide range of tasks involving different operators, since it is trained as a generalist operator learner rather than being tuned to approximate a specific operator. This is similar to how a single Large Language Model can solve a variety of natural language processing tasks specified by language prompts.
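To make the prompting mechanism concrete, below is a minimal sketch (not the authors' implementation) of how such a data prompt could be assembled: a few example (condition, quantity-of-interest) function pairs that implicitly define an operator, plus a query condition, are flattened into one token sequence, and a small transformer predicts the query QoI with no weight update at inference time. The class and function names (`ICONSketch`, `make_prompt`), the token layout, and the toy operator (differentiation of sine waves) are all illustrative assumptions, not the architecture from the papers.

```python
# Minimal, self-contained sketch of in-context operator learning (illustrative only).
import torch
import torch.nn as nn

class ICONSketch(nn.Module):
    def __init__(self, d_model=64):
        super().__init__()
        # each token embeds one sampled point: (x, value, role flag)
        self.embed = nn.Linear(3, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
        self.head = nn.Linear(d_model, 1)  # predicted QoI value at each query point

    def forward(self, tokens, query_x):
        # tokens: (batch, n_tokens, 3) prompt built from demo pairs + query condition
        # query_x: (batch, n_query, 1) locations where the QoI is requested
        h = self.encoder(self.embed(tokens))
        ctx = h.mean(dim=1, keepdim=True)  # crude pooled context summarizing the prompt
        q = torch.cat([query_x, torch.zeros_like(query_x), 2 * torch.ones_like(query_x)], dim=-1)
        return self.head(self.encoder(self.embed(q)) + ctx)

def make_prompt(demos, query_cond):
    """Flatten (condition, QoI) demo pairs and the query condition into one token sequence."""
    toks = []
    for cond, qoi in demos:  # cond, qoi: (n_points, 2) tensors of (x, value)
        toks.append(torch.cat([cond, torch.zeros(len(cond), 1)], dim=-1))  # role 0: condition
        toks.append(torch.cat([qoi, torch.ones(len(qoi), 1)], dim=-1))     # role 1: QoI
    toks.append(torch.cat([query_cond, torch.zeros(len(query_cond), 1)], dim=-1))
    return torch.cat(toks, dim=0).unsqueeze(0)  # (1, n_tokens, 3)

# usage: three demo pairs implicitly define the operator (here, d/dx);
# the model is asked to apply it to a new condition, in context.
x = torch.linspace(0, 1, 50).unsqueeze(-1)
demos = [(torch.cat([x, torch.sin(k * x)], -1),
          torch.cat([x, k * torch.cos(k * x)], -1)) for k in (1.0, 2.0, 3.0)]
query_cond = torch.cat([x, torch.sin(4.0 * x)], -1)
model = ICONSketch()
pred_qoi = model(make_prompt(demos, query_cond), x.unsqueeze(0))  # no weight update here
print(pred_qoi.shape)  # torch.Size([1, 50, 1])
```

During training, prompts like this would be drawn from many different operators, so the model learns to infer the operator from the demo pairs rather than memorize any single one; at inference, a new operator is specified purely by the prompt.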
Tracing the evolution of neural equation solvers, we see a three-act progression: Act 1 focused on approximating the solution function, while Act 2 shifted towards approximating the solution operator. ICON can be viewed as an early attempt at Act 3, where the model acts as an intelligent agent that adapts to new physical systems and tasks.
In our paper published in PNAS, we showed how a single ICON model handles 19 distinct problem types, encompassing forward and inverse ODE, PDE, and mean-field control (MFC) problems, each covering an infinite family of operators. In a subsequent paper, we fine-tuned the GPT-2 model as a multi-modal differential equation solver, prompting it to perform scientific machine learning with human language and LaTeX equations in addition to numerical data. In a third paper, we showed how a single ICON model can make forward and reverse predictions for different PDEs with different strides, and generalize well to PDEs with new forms, without any fine-tuning.