Transcript
A (2:46)
Ganju action boy language agent language agent language model Lahojanja agent open operator is agent the observation space coding search agent text browser browser feature that reward exploration fancy Samya startup Yeshua open agentic intelligence was computer usage and operator agent environment observation so you know computer for inference and computer open agentic intelligence large-scale agentic data synthesis pipeline general the reinforcement learning framework self-critique rubric rewards rubrics and reward the input output data rubric data annotation okay language agent language model the language model style and perspective diverse prompting fascinating generation John not judge generation the whole in the middle just be Zhao knowledge related data your accuracy like evaluation Wikipedia data now oh architecture the efficiency Jigga follow Jigga tag influence agent diversity diverse scale data synthesize for tool-use learning just interface or description expertise is diversity doing the rubrics trajectory configuration system prompt how doing the tool simple to complex the operation user simulation communication distributions equivalent to a world model simulation so Isha Tidawa sugar hybrid approach Ego Hamdan Pajiga Chamber research search agent now major search the API research Michelle website or server browser to Jiaohua machine environment is sense language model doing the shallow mass sensation creative writing to be referred to reward logic tasks diverse coverage and moderate difficulty moderate difficulty the mature leader task and Shujati Fatih Naja reward a signal yes difficulty easy to verify but hard to answer case by case complex instruction following hybrid rule verification verification instruction output verification constraint language agent complex instruction following the verification buffet instruction data and actually verify now sample the data generation pipeline human expert annotated data task Jamaica holder
education follow instruction so thinking process sentence level the face judge model language model language agent behavior IT Kubernetes solution host Dagui model concurrent sandbox behavior beyond the verification self-critique rubric reward creative writing helpfulness creativity depth of reasoning factuality safety ha language agent take action to action there Yona just concurrent environment you expose step or reset truly interactive environments browser for computer use recorder operator browsing agent deep research agent action space but operator web page browser deep research operator browsing agent deep research Laho Jao yeah Sudah Dan customization use case the last exam of my client FrontierMath yes you got Fei Chun benchmark task computer use agent but on the whole performance action honor synthetic data require scale up data scale up code aisle is hard to solve but easy to verify go to parallel the environment kian engineering time KV cache so yoha KV cache transformer yeah huanzi mcpaid then observation 2 do asian vill Wabi Asian such but Agent Jacob exploration procedure knowledge language model agent self improved annotator no for the data research model case yeah don't go Shuja is the king your show Bow Polish paper who are writing as well Germany unit had a long context a family of Asian cities creativity Tony Coding agent Should Alhandu consider augmented augmented code la Should I just ping Liu Ye now Misaji Zia bye bye.
