

Research Lab


Open Codes/Data Repositories


Panel Tree Package in R, TreeFactor on GitHub

Based on "Growing Panel Trees to Harvest Basis Assets and Pricing Kernels" by Cong, Feng, He, & He (2022)

Textual Factors Package in Python on GitHub

Based on "Textual Factors: A Scalable, Interpretable, and Data-Driven Approach to Analyzing Unstructured Information" by Cong, Liang, & Zhang (2019)

Ethereum Ecosystem: Data and Visualization

Based on "Inclusion and Democratization Through Web3 and DeFi? Initial Evidence from the Ethereum Ecosystem" by Cong, Tang, Wang, & Zhao (2022) 

Value-Price Divergence Factor (1978-2018)

Download the factors here. ReadMe.

Based on "RIM-based Value Premium and Factor Pricing Using Value-Price Divergence" by Cong, George, & Wang (2022) 

Research Databases  


  • Crypto/Blockchain/DeFi-Related (primarily via DEFT Lab)

    • Chainalysis

    • DeFiLlama

    • DeFi Pulse

    • Dune Analytics

    • Ethereum Improvement Proposals

    • Etherscan

    • Kaiko

    • Moonstream

    • Tronscan

  • General Financial Markets Data (via DEFT Lab and FinTech Initiative)

    • ETF Global

    • E-Commerce Transactions and Online Merchant Survey from

    • WIND (for Chinese Financial and Economic Data)

  • Corporate Data (via WRDS for SC Johnson affiliates)

    • BoardEx (North America)

    • CRSP

    • Compustat

    • I/B/E/S

    • FactSet Ownership v5

    • OptionMetrics

    • RavenPack

    • RepRisk

  • Others (via SC Johnson Research Servers or Johnson Research Library)

    • FISD

    • ShortInterest (1997-2012)

    • TAQ/DATQ

    • Financial Times fDi Markets (thanks to EMI and M&O area)

  • Other Datasets:

    • Bloomberg

    • Capital IQ Pro

    • EDGAR

    • Eikon

    • SDC Platinum

    • Enterprise Survey Data in China from Peking University

    • Online Survey of Micro-and-small Enterprises in China (OSOME)

Computing and Storage Resources


Cornell resources

  • CISER - Cornell Center for Social Sciences

    • Six publicly shared high-powered virtual machines available to all Cornell University students. This resource is sufficient for most CPU- and memory-intensive applications. 

  • Johnson Management research server

    • A collection of three virtual machines hosted by Johnson. Access is limited to Cornell students in the field of Management (Johnson Graduate School of Management).

  • Cornell CAC - Cornell University Center for Advanced Computing

    • The only Cornell service offering GPU-powered (as well as conventional CPU-powered) instances. Rates depend on the type of instance created.

  • Cornell BOX service for Unlimited Storage

    • Details available upon request.


Outside resources

Amazon (Amazon Web Services - AWS), Google (Google Cloud Services - GCS), and Microsoft (Microsoft Azure) each provide limited computing resources free of charge. The free tiers are broadly similar but differ in the specific services they include.

  • AWS - Amazon Web Services (free tier)

  • GCS - Google Cloud Services (free tier)

  • Azure - Microsoft Azure (free tier or through lab grants)

  • Xi'an Jiaotong University Computer Clusters for Data from the Ethereum Ecosystem (through lab collaboration)

  • Dropbox Professional (details available upon request)

Resources for Members and Affiliates  

