On a very small budget, the LIGHTS project demonstrates how a practical, globally distributed research infrastructure can be built from familiar, readily available tools. Health researchers across continents collaborate efficiently on a highly specialized collection of methodological guidance, without complex IT operations or expensive software.
Table of Contents
- Behind LIGHTS: An Infrastructure for Research on Methodological Guidelines
- From Concept to Practice: Collaboration Between University of Basel and Karakun
- The Adapter: Three Research Data Pipelines, One Workflow
- Key Benefits of the LIGHTS Research Architecture
- Component Schema Overview
- Conclusion
- Let’s Connect
Behind LIGHTS: An Infrastructure for Research on Methodological Guidelines
The Library of Guidance for Health Scientists (LIGHTS) is a living inventory of more than 2,000 hand-selected methods guidance papers. Its mission is to help clinical researchers find the best available methodological guidance to design and conduct high-quality studies.
Many clinical studies have avoidable limitations due to poor methodological decisions - such as inadequate study design, measurement bias, or flawed statistical analysis - even though suitable guidance documents exist. LIGHTS addresses this gap by systematically identifying, classifying, and making available methodological guidance documents.
From Concept to Practice: Collaboration Between University of Basel and Karakun
The project is led by Dr. Stefan Schandelmaier at the University of Basel and technically supported by Karakun AG.
At the heart of the platform lies HIBU, Karakun’s search and AI platform, which provides an intuitive, interactive search experience. To feed HIBU with structured, continuously updated metadata, Karakun developed a dedicated adapter system: a set of Java-based tools orchestrated through GitHub CI/CD pipelines.
The Adapter: Three Research Data Pipelines, One Workflow
1. Paperpile and Google Drive Integration
- Researchers collect and curate literature in Paperpile, a web-based reference manager.
- A BibTeX export is automatically uploaded to a GitHub repository, triggering the first pipeline.
- This pipeline converts the BibTeX file into a CSV export, commits it to Git, and synchronizes it to a Google Drive folder.
- In Google Sheets, scientific collaborators enrich the data with domain-specific metadata.
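The BibTeX-to-CSV step of this first pipeline can be illustrated with a minimal Java sketch. It is not the LIGHTS adapter's code: the class name `BibtexToCsv`, the column names, and the regular expressions are assumptions made for illustration. The parser only handles flat, single-level fields, so a real export with nested braces or string macros would need a proper BibTeX library.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch, not the LIGHTS adapter: flattens simple BibTeX entries into CSV. */
final class BibtexToCsv {

    // Entry header such as "@article{Doe2021,"
    private static final Pattern ENTRY =
            Pattern.compile("@(\\w+)\\s*\\{\\s*([^,\\s]+)\\s*,");
    // Flat fields such as: title = {Some title}  or  year = "2021" (no nested braces)
    private static final Pattern FIELD =
            Pattern.compile("(\\w+)\\s*=\\s*(?:\\{([^{}]*)\\}|\"([^\"]*)\")");

    /** Parses each entry into an ordered map with "type", "key", and its fields. */
    static List<Map<String, String>> parse(String bibtex) {
        List<Map<String, String>> entries = new ArrayList<>();
        Matcher header = ENTRY.matcher(bibtex);
        Map<String, String> current = null;
        int bodyStart = -1;
        while (header.find()) {
            if (current != null) {
                readFields(bibtex.substring(bodyStart, header.start()), current);
            }
            current = new LinkedHashMap<>();
            current.put("type", header.group(1).toLowerCase(Locale.ROOT));
            current.put("key", header.group(2));
            entries.add(current);
            bodyStart = header.end();
        }
        if (current != null) {
            readFields(bibtex.substring(bodyStart), current);
        }
        return entries;
    }

    private static void readFields(String body, Map<String, String> target) {
        Matcher field = FIELD.matcher(body);
        while (field.find()) {
            String value = field.group(2) != null ? field.group(2) : field.group(3);
            target.put(field.group(1).toLowerCase(Locale.ROOT), value.trim());
        }
    }

    /** Writes a header row plus one row per entry; missing fields become empty cells. */
    static String toCsv(List<Map<String, String>> entries, List<String> columns) {
        StringBuilder csv = new StringBuilder(String.join(",", columns)).append('\n');
        for (Map<String, String> entry : entries) {
            List<String> cells = new ArrayList<>();
            for (String column : columns) {
                cells.add(escape(entry.getOrDefault(column, "")));
            }
            csv.append(String.join(",", cells)).append('\n');
        }
        return csv.toString();
    }

    // Quotes cells containing commas, quotes, or line breaks (RFC 4180 style).
    private static String escape(String cell) {
        if (cell.contains(",") || cell.contains("\"") || cell.contains("\n")) {
            return "\"" + cell.replace("\"", "\"\"") + "\"";
        }
        return cell;
    }

    public static void main(String[] args) {
        String sample = "@article{Doe2021,\n"
                + "  title = {Example guidance, part one},\n"
                + "  year = {2021}\n"
                + "}\n";
        System.out.print(toCsv(parse(sample), List.of("key", "type", "title", "year")));
    }
}
```

Fixing the column order in `toCsv` keeps the generated file stable between runs, so Git diffs of the CSV show only records that actually changed.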
2. Data Transformation for HIBU
- When the team is ready to publish, a second pipeline is manually triggered.
- It fetches the curated CSV from Google Drive and merges it with Paperpile metadata.
- The combined data is transformed into JSON objects that conform to HIBU’s flexible index schema, which relies on field naming conventions (e.g., for text, date, and multilingual fields).
- The resulting JSON becomes the search index input for HIBU.
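The merge and transformation steps above can be sketched in Java as follows. This is an illustration under stated assumptions, not Karakun's implementation: the class `IndexDocumentBuilder`, the rule that non-blank curated values override reference metadata, and the suffixes `_text`, `_date`, `_id`, and `_keyword` are all hypothetical and do not describe HIBU's actual naming conventions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical sketch: merges two records and renders one JSON index document. */
final class IndexDocumentBuilder {

    /** Starts from the reference metadata; non-blank curated values are added or override. */
    static Map<String, String> merge(Map<String, String> reference, Map<String, String> curated) {
        Map<String, String> merged = new LinkedHashMap<>(reference);
        curated.forEach((column, value) -> {
            if (value != null && !value.trim().isEmpty()) {
                merged.put(column, value);
            }
        });
        return merged;
    }

    /**
     * Renders a flat JSON object. Field names are the column name plus a type suffix;
     * the suffixes are an assumed convention, not HIBU's documented schema.
     */
    static String toJson(Map<String, String> record, Map<String, String> suffixByColumn) {
        StringJoiner json = new StringJoiner(",", "{", "}");
        record.forEach((column, value) -> {
            String field = column + suffixByColumn.getOrDefault(column, "_text");
            json.add(quote(field) + ":" + quote(value));
        });
        return json.toString();
    }

    // Escapes a string as a JSON string literal.
    private static String quote(String raw) {
        StringBuilder out = new StringBuilder("\"");
        for (char c : raw.toCharArray()) {
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.append('"').toString();
    }

    public static void main(String[] args) {
        Map<String, String> reference = new LinkedHashMap<>();
        reference.put("key", "Doe2021");
        reference.put("title", "Example guidance");
        reference.put("year", "2021");

        Map<String, String> curated = new LinkedHashMap<>();
        curated.put("key", "Doe2021");
        curated.put("topic", "Study design");

        Map<String, String> suffixes = new LinkedHashMap<>();
        suffixes.put("key", "_id");
        suffixes.put("year", "_date");
        suffixes.put("topic", "_keyword");

        System.out.println(toJson(merge(reference, curated), suffixes));
    }
}
```

In a real pipeline, the two inputs would be joined on a shared identifier such as the citation key before merging. A JSON library would normally replace the hand-written `quote` helper, which is included only to keep the sketch dependency-free.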
3. Quality Assurance
- A third pipeline runs automated tests whenever code changes occur.
- It validates syntax, checks for duplicates, and verifies that the data formats remain backward compatible.
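As an illustration of the duplicate check, the following Java sketch groups records whose titles match after normalization. The source does not say how the real tests identify duplicates, so matching on normalized titles, the class `DuplicateCheck`, and its method names are assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of a duplicate check keyed on normalized titles. */
final class DuplicateCheck {

    /** Lower-cases the title and collapses punctuation and whitespace to single spaces. */
    static String normalize(String title) {
        return title.toLowerCase(Locale.ROOT)
                .replaceAll("[^\\p{L}\\p{Nd}]+", " ")
                .trim();
    }

    /** Returns groups of citation keys whose normalized titles collide. */
    static List<List<String>> findDuplicates(Map<String, String> titleByKey) {
        Map<String, List<String>> keysByTitle = new LinkedHashMap<>();
        titleByKey.forEach((key, title) ->
                keysByTitle.computeIfAbsent(normalize(title), t -> new ArrayList<>()).add(key));

        List<List<String>> duplicates = new ArrayList<>();
        for (List<String> keys : keysByTitle.values()) {
            if (keys.size() > 1) {
                duplicates.add(keys);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        Map<String, String> titles = new LinkedHashMap<>();
        titles.put("Doe2021", "Sample Size Calculation: A Guide");
        titles.put("Doe2021a", "sample size calculation - a guide ");
        titles.put("Roe2019", "Reporting of subgroup analyses");
        System.out.println(findDuplicates(titles));
    }
}
```

A pipeline test could then fail the build whenever `findDuplicates` returns a non-empty list, which surfaces the colliding citation keys before the data reaches the search index.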
Key Benefits of the LIGHTS Research Architecture
- Version-Controlled Collaboration – Every artifact (BibTeX, CSV, JSON) is tracked in Git without researchers having to handle Git directly.
- Automated Validation – Pipelines detect structural or semantic issues early.
- Rapid Deployment – New or corrected records are integrated into HIBU within minutes.
- Low Maintenance – Built entirely from existing cloud tools and open standards.
Component Schema Overview
       +----------------+
       |   Paperpile    |
       |  (References)  |
       +--------+-------+
                |
                v
      +---------+----------+
      | GitHub Repository  |
      | (Artifacts, CI/CD) |
      +----+-------+-------+
           |       |
     (Pipeline 1) (Pipeline 3)
           |       v
           | +-----------+
           | |   Tests   |
           | | Validation|
           | +-----------+
           v
  +--------+---------+
  |  Google Drive/   |
  |  Google Sheets   |
  +--------+---------+
           |
     (Pipeline 2)
           v
      +----+----+
      |  JSON   |
      |  (HIBU  |
      |  Input) |
      +----+----+
           |
           v
      +----+----+
      |  HIBU   |
      |  Search |
      +---------+
Conclusion
LIGHTS showcases how lean, automated data pipelines can empower international research projects. By combining common tools such as GitHub, Paperpile, and Google Sheets with Karakun’s HIBU platform, the team created a robust, low-cost ecosystem that turns methodological research into a truly global, living collaboration.
Let’s Connect
Want to learn how your organization can build similar intelligent data workflows? Visit karakun.com or reach out to the HIBU team.