July 06, 2025・By AIxiv

A research team from Professor Biwei Huang’s lab at the University of California, San Diego (UC San Diego) has proposed Causal-Copilot, an autonomous causal analysis agent. The lab works at the intersection of causal reasoning and machine learning and has produced multiple significant results in causal discovery and causal representation learning.

 

 

 

A Common Dilemma

 

Imagine this scenario: you’re a biologist holding gene expression data, with the intuition that certain genes regulate each other. But how can you scientifically verify this? You’ve heard of “causal discovery,” but even the names of algorithms like PC and GES are unfamiliar to you.

 

Or you’re a sociologist who wants to evaluate the true impact of an education policy on student performance. You know that simple comparisons can be affected by other factors, but when faced with methods like difference-in-differences or propensity score matching — each with its own assumptions — you’re at a loss for where to start.

This is the current state of causal analysis: theory is becoming richer, tools are more powerful, but the barrier to entry remains high.

 

 

The Limitations of Pre-trained Models

 

Today’s AI systems, including the most advanced large language models, are essentially pattern recognizers. They can detect that “A and B often occur together,” but from that pattern alone they cannot determine whether “A causes B,” “B causes A,” or “C influences both A and B.”

 

This limitation can have serious consequences in practice. For example, data might show that students who use a certain education app have higher grades. A correlation-based AI may recommend promoting the app to improve grades. But causal analysis could reveal that it’s actually high-achieving students who are more likely to use the app, not that the app improves performance.
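
The app example can be made concrete with a small simulation. This is a minimal sketch under invented assumptions: the probabilities, grade levels, and the `simulate_student` helper are hypothetical and do not come from any real dataset. In this simulated world the app has no effect at all, yet a naive comparison still shows a large gap.

```python
import random
from statistics import mean

random.seed(0)

def simulate_student():
    """One student from a hypothetical process where ability confounds app use and grades."""
    high_achiever = random.random() < 0.5
    # Assumption: high achievers are much more likely to use the app.
    uses_app = random.random() < (0.8 if high_achiever else 0.2)
    # Assumption: the grade depends on ability only; the app has NO effect.
    grade = (80 if high_achiever else 60) + random.gauss(0, 5)
    return high_achiever, uses_app, grade

students = [simulate_student() for _ in range(20000)]

def mean_gap(rows):
    """Difference in mean grade between app users and non-users."""
    users = [grade for _, uses_app, grade in rows if uses_app]
    non_users = [grade for _, uses_app, grade in rows if not uses_app]
    return mean(users) - mean(non_users)

# Naive comparison: ignores ability entirely.
naive_gap = mean_gap(students)

# Adjusted comparison: compare users and non-users within each ability group,
# then average the gaps weighted by group size.
strata = [[s for s in students if s[0] == level] for level in (True, False)]
adjusted_gap = sum(mean_gap(rows) * len(rows) for rows in strata) / len(students)

print(f"naive gap: {naive_gap:.2f}, adjusted gap: {adjusted_gap:.2f}")
```

By construction the naive gap comes out at roughly 12 points, while the gap within ability groups is close to zero. A correlation-based recommendation would act on the first number; a causal analysis would look at the second.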

 

 

Causal Analysis Consists of Two Core Tasks

 

  • Causal Discovery - identifies causal relationships between variables from data, building a causal graph to reveal how a system operates.
  • Causal Inference - estimates intervention effects based on the causal graph, answering “What would happen if we did this?”

 

These two tasks are complementary, forming a complete picture of how the world works.
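
To give a feel for the discovery side, constraint-based algorithms such as PC decide which edges to keep by testing conditional independence. The sketch below simulates a hypothetical chain X → Z → Y (the variables and coefficients are invented) and uses partial correlation, a test that is only appropriate for roughly linear-Gaussian data, to show that X and Y become independent once Z is accounted for.

```python
import math
import random

random.seed(1)

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def partial_corr(a, b, c):
    """Correlation of a and b after linearly removing the influence of c."""
    r_ab, r_ac, r_bc = pearson(a, b), pearson(a, c), pearson(b, c)
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))

# Hypothetical chain: X -> Z -> Y, with no direct X -> Y edge.
n = 5000
x = [random.gauss(0, 1) for _ in range(n)]
z = [2.0 * xi + random.gauss(0, 1) for xi in x]
y = [-1.5 * zi + random.gauss(0, 1) for zi in z]

r_xy = pearson(x, y)                # strongly non-zero: X and Y are correlated
r_xy_given_z = partial_corr(x, y, z)  # near zero: Z explains the association

print(f"corr(X, Y) = {r_xy:.3f}, corr(X, Y | Z) = {r_xy_given_z:.3f}")
```

A discovery algorithm would read the near-zero partial correlation as evidence against a direct X-Y edge. Inference then takes the resulting graph as input when estimating effects.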

 

However, mastering these methods requires deep statistical expertise and extensive practical experience. Each algorithm has its own applicable scenarios and limitations, and the wrong choice can lead to completely incorrect conclusions. This expertise barrier excludes many researchers who need causal analysis.

 

 

Causal-Copilot - Making the Complex Simple

 

We propose an elegant solution: since the main difficulty in causal analysis lies in method selection and parameter tuning, why not let AI handle this part?

Causal-Copilot, an autonomous causal analysis agent, was built on this idea. Its main strength is breadth: it integrates more than 20 state-of-the-art causal analysis algorithms into a one-stop workflow. Whether your data is tabular or time series, linear or nonlinear, clean or noisy, Causal-Copilot is designed to select a suitable analytical method automatically.

 

 

A Unified Intelligent System for Causal Discovery and Inference

 

The core innovation of Causal-Copilot is automating the entire causal discovery and causal inference workflow. It integrates 20+ algorithms covering the full process — from structure learning to effect estimation:

 

 

Causal Discovery

 

  • Automatically identify causal relationships between variables and build causal graphs.
  • Handle a variety of data types: linear/nonlinear, discrete/continuous, static/time series, Gaussian/non-Gaussian noise.
  • Address real-world challenges like latent confounders, missing data, and data heterogeneity.
  • Built-in CPU/GPU acceleration for large-scale, high-dimensional scenarios.
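
Building a causal graph also means deciding which way edges point. One standard clue, sketched here under invented assumptions (the variables, coefficients, and noise levels are hypothetical, and this is not Causal-Copilot's code), is the collider pattern X → Z ← Y: two causes that are independent on their own become dependent once their common effect is conditioned on.

```python
import math
import random

random.seed(3)

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)

def partial_corr(a, b, c):
    """Correlation of a and b after linearly removing the influence of c."""
    r_ab, r_ac, r_bc = pearson(a, b), pearson(a, c), pearson(b, c)
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))

# Hypothetical collider: X and Y are independent causes of Z.
n = 5000
x = [random.gauss(0, 1) for _ in range(n)]
y = [random.gauss(0, 1) for _ in range(n)]
z = [xi + yi + random.gauss(0, 0.5) for xi, yi in zip(x, y)]

marginal = pearson(x, y)            # near zero: X and Y are unrelated
conditional = partial_corr(x, y, z)  # clearly negative: conditioning on Z links them

print(f"corr(X, Y) = {marginal:.3f}, corr(X, Y | Z) = {conditional:.3f}")
```

This is the mirror image of a chain or a common cause, where conditioning on the middle variable removes a dependence rather than creating one. That asymmetry lets constraint-based methods orient both edges into Z.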

 

 

Causal Inference

 

  • Estimate intervention effects based on discovered causal structures.
  • Support average treatment effects, heterogeneous effects, and counterfactual reasoning.
  • Provide uncertainty quantification and robustness checks for effect estimates.
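
As a rough illustration of effect estimation with uncertainty quantification, the sketch below estimates an average treatment effect by stratifying on a single binary confounder and attaches a bootstrap percentile interval. The data-generating process, the `stratified_ate` helper, and the choice of 200 resamples are all assumptions made for this example; they are not taken from Causal-Copilot.

```python
import random
from statistics import mean

random.seed(2)
TRUE_EFFECT = 2.0  # assumption: the effect built into the simulated data

def draw_unit():
    """One unit: confounder c affects both treatment t and outcome y."""
    c = random.random() < 0.5
    t = random.random() < (0.7 if c else 0.3)
    y = TRUE_EFFECT * t + 3.0 * c + random.gauss(0, 1)
    return c, t, y

def stratified_ate(rows):
    """Treated-minus-control gap within each confounder level, weighted by level size."""
    total = 0.0
    for level in (True, False):
        stratum = [r for r in rows if r[0] == level]
        treated = [y for _, t, y in stratum if t]
        control = [y for _, t, y in stratum if not t]
        total += (mean(treated) - mean(control)) * len(stratum) / len(rows)
    return total

data = [draw_unit() for _ in range(2000)]
ate_hat = stratified_ate(data)

# Bootstrap: resample units with replacement and re-estimate each time.
n_boot = 200
estimates = sorted(
    stratified_ate(random.choices(data, k=len(data))) for _ in range(n_boot)
)
ci_low = estimates[int(0.025 * n_boot)]
ci_high = estimates[int(0.975 * n_boot) - 1]

print(f"ATE estimate: {ate_hat:.2f}, 95% bootstrap interval: [{ci_low:.2f}, {ci_high:.2f}]")
```

Stratification only works here because the single confounder is observed and binary. With many or continuous confounders, methods such as the propensity score matching mentioned earlier serve the same purpose.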

 

 

Modular Architecture

 

Causal-Copilot adopts a modular architecture with five core components:

 

  1. User Interaction Module - Supports natural language queries and interactive feedback, e.g., specifying preferences and constraints.
  2. Preprocessing Module - Comprehensive data preparation, including missing value detection/imputation, feature transformation, schema extraction, and statistical diagnostics for tabular and time-series data.
  3. Algorithm Selection Module - Filters and ranks algorithms based on data characteristics and expert/empirical knowledge; configures hyperparameters; executes algorithms and handles errors.
  4. Postprocessing Module - Uses bootstrap validation, LLM commonsense reasoning, and user feedback to refine causal graphs; conducts sensitivity and robustness analyses for causal effects.
  5. Report Generation Module - Compiles analysis results into user-friendly visual research reports, including the full causal analysis process and LLM-generated insights.
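
The filter-and-rank step of the Algorithm Selection Module can be pictured with a small rule-based stand-in. This is only a sketch: the `DataProfile` and `Candidate` types, the five-entry catalog, and the scoring rule are hypothetical, and the real system draws on expert and empirical knowledge rather than hard-coded rules like these.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataProfile:
    """Hypothetical summary produced by a preprocessing step."""
    time_series: bool
    gaussian_noise: bool
    latent_confounders: bool

@dataclass(frozen=True)
class Candidate:
    """Hypothetical catalog entry describing what an algorithm assumes."""
    name: str
    time_series: bool
    handles_latent: bool
    needs_non_gaussian: bool

# Illustrative catalog only; not the system's actual algorithm list.
CATALOG = [
    Candidate("PC", time_series=False, handles_latent=False, needs_non_gaussian=False),
    Candidate("GES", time_series=False, handles_latent=False, needs_non_gaussian=False),
    Candidate("FCI", time_series=False, handles_latent=True, needs_non_gaussian=False),
    Candidate("DirectLiNGAM", time_series=False, handles_latent=False, needs_non_gaussian=True),
    Candidate("PCMCI", time_series=True, handles_latent=False, needs_non_gaussian=False),
]

def select(profile):
    """Filter out algorithms whose assumptions are violated, then rank the rest."""
    viable = [
        c for c in CATALOG
        if c.time_series == profile.time_series
        and (c.handles_latent or not profile.latent_confounders)
        and not (c.needs_non_gaussian and profile.gaussian_noise)
    ]

    def score(c):
        # Assumed preference: favor methods that exploit non-Gaussian noise when present.
        return 1 if c.needs_non_gaussian and not profile.gaussian_noise else 0

    # sorted() is stable, so ties keep their catalog order.
    return [c.name for c in sorted(viable, key=score, reverse=True)]

print(select(DataProfile(time_series=False, gaussian_noise=False, latent_confounders=False)))
print(select(DataProfile(time_series=False, gaussian_noise=True, latent_confounders=True)))
print(select(DataProfile(time_series=True, gaussian_noise=True, latent_confounders=False)))
```

Separating hard filters (assumptions that must hold) from soft ranking (preferences among valid options) mirrors the "filters and ranks" wording above. It also makes it easy to report why an algorithm was excluded.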

 

 

Multi-Dimensional Evaluation of Causal Discovery and Inference

 

We systematically evaluated Causal-Copilot’s performance across various causal discovery and inference scenarios, including both time-series and non-time-series data.

 

On tabular data, we covered baseline settings, data quality challenges (heterogeneous domains, measurement error, missing data), and composite scenarios (clinical, financial, social network data). The system maintained strong performance even in very large networks with up to 1000 nodes.

 

For time-series data and causal inference tasks, the evaluations likewise confirmed the system’s strong adaptability. On CSuite benchmarks and real datasets, Causal-Copilot significantly outperformed both:

 

  • GPT-4o directly invoking causal algorithms
  • Existing traditional causal discovery algorithms
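
This post does not say which metrics the comparison used. As an assumption-laden illustration of how a recovered graph is commonly scored against a known ground truth, the sketch below computes precision, recall, and F1 over directed edges. The `edge_scores` helper and the toy graphs are hypothetical.

```python
def edge_scores(true_edges, predicted_edges):
    """Precision, recall, and F1 over directed edges given as (cause, effect) pairs."""
    true_set, pred_set = set(true_edges), set(predicted_edges)
    hits = len(true_set & pred_set)  # edges with the right endpoints AND direction
    precision = hits / len(pred_set) if pred_set else 0.0
    recall = hits / len(true_set) if true_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

# Toy example: the prediction reverses one edge and adds a spurious one.
truth = {("A", "B"), ("B", "C"), ("A", "C")}
predicted = {("A", "B"), ("B", "C"), ("C", "A"), ("A", "D")}

precision, recall, f1 = edge_scores(truth, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Counting a reversed edge as a miss is a deliberate choice here, since getting the direction wrong changes the causal claim.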

 

 

 

 

 

Reference

Xinyue Wang, Kun Zhou, Wenyi Wu, Har Simrat Singh, Fang Nan, Songyao Jin, Aryan Philip, Saloni Patnaik, Hou Zhu, Shivam Singh, Parjanya Prashant, Qian Shen, Biwei Huang. "Causal-Copilot: An autonomous causal analysis agent." arXiv preprint arXiv:2504.13263 (2025).

By unifying the entire causal discovery and inference pipeline, Causal-Copilot helps researchers understand causal mechanisms, make better-grounded decisions, and accelerate scientific discovery. The research team has open-sourced the system, including code, tutorials, and an online demo platform, and invites researchers worldwide to collaborate on improvements.


Integrated with 20+ State-of-the-Art Algorithms, Surpassing GPT-4o

The Autonomous Causal Analysis Agent is Here
