DuckDB as a DrugDB: a Free and Simple Multi-Model Drug and Trial Database

A fourth case study on clinical trials

Sixing Huang
11 min read · Oct 15, 2024


Generated by https://pixlr.com/image-generator/.

Drug trials test the safety, efficacy, and effectiveness of new drugs or treatments in human subjects. However, these trials are generally very costly. The trials for 101 new molecular entities between 2015 and 2017 had an estimated median cost of US$48 million (Moore et al.). One major cost factor is recruiting volunteers, because many patients are unaware of these opportunities. An easy-to-use, patient-facing drug trial database can address this challenge. It could raise awareness of clinical trials and make it easier for patients to find and participate in studies that align with their specific health conditions and treatment goals.

In my previous articles, “Clinical Trial Search with Google Spanner: Graph, SQL, Vector, and LLM All in One Query”, “PostgreSQL Goes Multi-Model: Graph, Vector, and SQL”, and “Build a Drug Trial Database with the Multi-Model SurrealDB”, I compared Google Spanner, PostgreSQL, and SurrealDB using a combined dataset containing over 5000 drugs, 2000 disorders, and 2000 clinical trials. I selected these three multi-model databases because the dataset itself is multi-modal: querying it well requires a combination of SQL, graph, vector, and full-text searches.
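To make the idea of a combined query concrete, here is a minimal sketch in plain Python rather than in any of the databases named above. All records, field names, drug labels, and embedding values are hypothetical toy data invented for illustration, and the keyword check is only a stand-in for real full-text search. It shows how one search can chain a graph hop, a relational filter, a text match, and a vector ranking.

```python
import math

# Hypothetical toy records; names and values are illustrative, not from the real dataset.
trials = [
    {"id": "T1", "phase": 3, "summary": "Insulin therapy for adults with diabetes", "embedding": [0.9, 0.1]},
    {"id": "T2", "phase": 2, "summary": "Statin dosing in older adults", "embedding": [0.1, 0.9]},
    {"id": "T3", "phase": 3, "summary": "Diet counselling for diabetes prevention", "embedding": [0.7, 0.3]},
]

# Hypothetical graph edges: (trial) -[TESTS]-> (drug)
tests_edges = [("T1", "drug_a"), ("T2", "drug_b"), ("T3", "drug_a")]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def search(drug, min_phase, keyword, query_vec):
    # Graph step: follow TESTS edges to find trials linked to the drug.
    linked = {trial_id for trial_id, d in tests_edges if d == drug}
    hits = [
        t for t in trials
        if t["id"] in linked                      # graph result
        and t["phase"] >= min_phase               # relational (SQL-style) filter
        and keyword.lower() in t["summary"].lower()  # stand-in for full-text search
    ]
    # Vector step: rank the surviving trials by similarity to the query embedding.
    return sorted(hits, key=lambda t: cosine(t["embedding"], query_vec), reverse=True)


results = search("drug_a", min_phase=3, keyword="diabetes", query_vec=[1.0, 0.0])
print([t["id"] for t in results])
```

In a multi-model database, these four steps would be expressed in a single query instead of hand-written loops, which is the capability the comparison is about.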


Written by Sixing Huang

A Neo4j Ninja and German bioinformatician at Gemini Data. I like to try things: cloud, ML, satellite imagery, Japanese, plants, and traveling the world.