LARK Lab Manual

This lab manual outlines the values, structure, and research practices that guide work in the LARK Lab. It reflects our problem-driven approach to clinical NLP and AI research, our emphasis on reproducibility and transparency, and our commitment to building tools that operate within real healthcare systems.

While developed as an internal guide for lab members, we share this manual publicly to provide insight into how our lab operates, how we collaborate, and the standards we hold for responsible, impactful research.

This manual is periodically updated as the lab and its research evolve.

Lab Structure and Organization

The LARK Lab operates as a grant-centered research group, focusing on projects that are either currently funded or strategically developed to secure external funding. This structure ensures that our research remains aligned with real-world clinical needs, scientific rigor, and impactful deliverables. As part of a university hospital environment, the lab collaborates closely with clinicians, healthcare administrators, and informatics teams to access and analyze Electronic Health Record (EHR) data. This unique positioning enables us to develop and evaluate foundational NLP and AI technologies directly within the healthcare ecosystem, ensuring our tools address genuine clinical workflows, support patient care, and can be effectively integrated into operational systems. Our organizational approach emphasizes transparency, reproducibility, and steady progress toward grant milestones, enabling both innovation and reliability across all projects.

Core Principles

1. Problem-Driven and Principle-Driven AI

At LARK, meaningful innovation begins with real problems and strong scientific principles.

We start with concrete clinical or scientific questions, grounding our work in the needs, constraints, and reasoning processes of the real world. Innovation is not about changing architectures for novelty's sake; it emerges naturally when the problem is deeply understood.

Our approaches follow essential principles: clarity, reproducibility, safety, bias awareness, interpretability, and alignment with real-world clinical use. These principles guide our design, evaluation, and iteration of methods.

In LARK, creativity grows from understanding, and rigor grows from principles.

Problem and principle together drive the AI we build.

2. Small Steps → Reproducible Baselines → Iterative Improvement

Our work emphasizes:

  • Developing and running reproducible baselines
  • Making incremental, measurable progress
  • Documenting what works and what does not
  • Asking questions early rather than working in isolation

Big leaps are valuable, but only when built on a solid, reproducible foundation.

3. Communicate Early, Communicate Often

If you are unsure where to start or if the direction is unclear, ask. Asking questions is a strength, not a weakness.

No one is expected to know everything; LLM and NLP research move too quickly for that to be feasible. What matters most is alignment, clarity, and shared understanding.

4. Transparency Over Perfection

Small, consistent progress updates are essential.

Visibility helps us support each other, avoid redundant work, and confirm that we are moving in the right direction. It is completely okay if experiments fail or approaches do not work; hidden progress, however, slows the entire team.

5. A High-Standards but Supportive Lab

We aim to produce impactful, rigorous, and safe AI research. This requires high standards and a culture grounded in kindness, respect, and collaboration.

We give direct, constructive feedback because we want each other to succeed, not because we judge ability or potential. We push for excellence together, not alone.

6. Growth Mindset and Self-Driven Learning

No one starts as an expert in everything. We work in a field where skills change rapidly. What matters is the willingness to learn, ask questions, and iterate.

Everyone in the lab, including the PI, is learning every day.

Curiosity, not perfection, drives growth.

7. Healthy Boundaries and Professionalism

We respect each other’s time, effort, and well-being. If project directions evolve or expectations shift, we communicate openly so we can adjust together.

Professionalism entails striking a balance between dedication and sustainability, while maintaining a work environment that fosters growth and well-being.

Expectations

All lab members are expected to take primary responsibility for the success of their research projects and their professional development. While mentorship and support are provided, individuals are responsible for managing their work, meeting agreed-upon milestones, and communicating progress or challenges in a timely manner. As members of the lab, everyone is expected to participate actively in the research community and contribute to a collaborative, respectful, and supportive team environment.

To facilitate collaboration and communication across the group, lab members are generally expected to be available during core weekday working hours of 10:00 AM to 4:00 PM. Meetings and group activities will be scheduled within these hours whenever possible to accommodate shared discussion and teamwork.

The lab operates in a hybrid format. All full-time team members are expected to work in the office at least three days per week. Student interns are not expected to work on-site during the academic semester; however, during the summer or extended academic break periods, they are expected to come into the office once per week, typically on the day of the weekly lab meeting, to support collaboration and engagement with the lab.

Roles and Responsibilities

This section provides a clear overview of roles and responsibilities within the lab. The descriptions below outline core expectations for each position and are used to guide onboarding, support professional development, and inform annual performance reviews and evaluations of success.

Principal Investigator

The PI provides the scientific vision, strategic leadership, and overall oversight for the lab. The PI is responsible for defining the research agenda, ensuring scientific rigor, and guiding the lab’s work in alignment with institutional priorities and funding requirements.

They lead the development of new research initiatives, oversee grant acquisition and management, and ensure that projects are designed, conducted, and reported in accordance with ethical, regulatory, and scientific standards. They are ultimately accountable for compliance with institutional policies, sponsor requirements, and data governance practices.

The PI mentors and supervises lab personnel, supports professional development, and fosters a collaborative, inclusive, and productive research environment. The PI represents the lab in internal and external collaborations, disseminates research findings, and ensures the long-term sustainability and impact of the lab’s scientific mission.

Lab Manager

The Lab Manager supports the overall functioning of the lab by coordinating day-to-day operations, administrative activities, and organizational workflows that enable high-quality research. This role works closely with the PI and research team to ensure projects progress smoothly and lab processes remain efficient and well-documented.

They assist with research coordination by tracking timelines, organizing meetings, and supporting communication across lab members and collaborators. They help maintain compliance with institutional policies by organizing required training records, documentation, and operational materials, and by supporting established lab protocols.

They oversee general lab operations, including scheduling, procurement, onboarding support for new team members, and maintenance of shared resources such as documentation systems and data organization structures. Through these responsibilities, the Lab Manager contributes to a professional, collaborative, and well-organized lab environment that supports the lab’s scientific mission.

Postdoctoral Fellows (Postdocs)

Postdocs are advanced researchers who play a key role in driving the lab’s scientific work. They conduct independent and collaborative research aligned with the lab’s goals, applying advanced computational, statistical, and methodological approaches to biomedical and clinical research questions.

Postdocs lead research projects from design through analysis and dissemination, including developing methods, interpreting results, and contributing to publications and presentations. They work closely with the PI to shape research directions, contribute to grant-related activities, and ensure scientific rigor and reproducibility.

Postdocs mentor junior lab members, support collaborative projects across disciplines, and contribute to a professional and productive research environment. They are expected to communicate progress clearly, uphold ethical and compliance standards, and actively support the lab’s research mission.

Research Assistants (RAs) / Student Interns

RAs support ongoing research projects while gaining hands-on experience in biomedical and computational research. They work under supervision to assist with data preparation, basic analyses, and project-related tasks that contribute to the lab’s research objectives.

These roles involve supporting data organization, documentation, and technical workflows, as well as assisting with testing, quality checks, and routine research activities. RAs work closely with postdoctoral fellows and other team members to help meet project milestones.

RAs are expected to participate in lab meetings and training activities, follow lab protocols and data governance standards, and maintain professionalism and academic integrity. Through these responsibilities, they build foundational research skills while contributing to a collaborative lab environment.

Deadlines

The lab aims to submit papers to major NLP and AI conferences of interest each year, depending on project scope and readiness.

The primary conferences we target include:

  • ACL — Association for Computational Linguistics (DDL: February)
  • EMNLP — Empirical Methods in Natural Language Processing (DDL: May–June)
  • NAACL — North American Chapter of the ACL (DDL: December)
  • EACL — European Chapter of the ACL (DDL: October)
  • AACL — Asian Chapter of the ACL (DDL: May)
  • COLING — International Conference on Computational Linguistics (DDL: After EMNLP)
  • COLM — Conference on Language Modeling (DDL: March; not part of ACL)

All paper submission deadlines are tracked on the shared team calendar and should be treated as hard deadlines.

In addition to conference submissions, the lab has grant deadlines that must be met, as these grants fund the lab’s research activities. Grant-related deadlines take priority and require advance planning and coordination.

Lab members are expected to:

  • Monitor the shared calendar for upcoming conference and grant deadlines
  • Plan project timelines accordingly
  • Plan vacations around deadlines
  • Avoid scheduling vacations immediately before a deadline unless the submission is already ready or sufficient progress has been made to ensure on-time submission

When possible, vacations should be planned after submissions, not before.

If you are uncertain whether planned time off may conflict with a conference or grant deadline, discuss it with Yanjun or the project lead in advance.

Morgan Pena, MPH
Lab Research Coordinator; Project Manager
Yanjun Gao, PhD
Assistant Professor