Minutes: Software Engineering Assessment Discussion 2025

by Alex Johnson

Discussion Category: RebeccaSizer, Software_Engineering_Assessment_2025_AR_RW_RS

Date: 17-11-2025

Attendees:

  • RS and AR (10:00 - 11:00)
  • RW and AR (11:00 - 12:00)

Time: 10:00 - 11:00

Attendees: RS and AR

Aims and Objectives:

  • Review HGVS_converter function to decide what the outputs should be and how to implement the function in the database scripts.

    • We decided that the output should be returned as a tuple for each variant rather than a dictionary. This makes the variants easier to iterate through and lets each value be accessed by its position in the tuple. Tuples are ordered and immutable, so the order of the variant data stays consistent and values cannot be modified by accident. A dictionary would give more descriptive key-based access, but the simpler tuple fits the need for straightforward iteration here.
    • We considered whether it would be better to get the HGNC_ID and gene symbol from the Variant Validator output rather than from GenBank using the Entrez output. We decided to opt for the Variant Validator output. Variant Validator is designed for variant annotation, so it likely provides this information more directly than a general-purpose database such as GenBank, and using it avoids parsing GenBank records for these two fields.
  • Create a Sprint 3 branch from which we should develop in the future.

    • Pull from database
    • Pull from flask_web_framework_RS
    • Pull from clivar_api_request_AR
    • Pull from clivar_api_request_RW
    • Pull from web_query_app_RW
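
The tuple-per-variant output decided in this session could look roughly like the sketch below. It uses a NamedTuple, which is still a plain tuple (ordered, immutable, unpackable in a loop) but also allows access by name. The class name, field names and example values are assumptions for illustration only; they are not the actual HGVS_converter output.

```python
from typing import NamedTuple


class VariantRecord(NamedTuple):
    """Hypothetical per-variant output; the field names are assumptions."""
    hgvs_genomic: str
    hgnc_id: str
    gene_symbol: str


def to_records(rows):
    """Turn an iterable of 3-item sequences into VariantRecord tuples."""
    return [VariantRecord(*row) for row in rows]


# Made-up placeholder values, used only to show the shape of the data.
example_rows = [
    ("NC_000000.0:g.100A>G", "HGNC:0000", "GENE1"),
    ("NC_000000.0:g.200C>T", "HGNC:0001", "GENE2"),
]

records = to_records(example_rows)

# Each record unpacks like an ordinary tuple while iterating.
for hgvs_genomic, hgnc_id, gene_symbol in records:
    print(hgvs_genomic, hgnc_id, gene_symbol)
```

Because each record is a tuple, it can be passed directly as a row of parameters to a database insert, which may suit the database scripts mentioned above.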

Time: 11:00 - 12:00

Attendees: RW and AR

Aims and Objectives:

  • Send Rachel the headers for each table in the database so that she can update the headers that the user sees on the webpage. (Arjun)
  • The Flask App currently searches by gene, patient ID and variant. A free text search box and a dropdown menu are provided. A download to Excel button is due to be incorporated, so that users can export results for further analysis or reporting.
  • Look at ClinVar script and determine if it can find the Conditions for each variant. (AR)
  • See if ClinVar API can be queried to find the Condition. (RW)
  • The ClinVar script currently does not work correctly. It needs to be developed so that it can take the NC_ number, including the g. number, and output the correct information from ClinVar. ClinVar provides the clinical significance of each variant, so the script must parse this identifier format correctly before its annotations can be relied on.
  • RW to code review the HGVS_convertion script, as AR and RS worked on it.
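
As a starting point for the ClinVar fix, the NC_ number and g. number could be split apart and validated before any query is made. The sketch below is an assumption about how that might be done, not the existing ClinVar script: the function name and pattern are hypothetical, and the pattern only covers simple substitutions (for example 100A>G), not deletions, duplications or insertions.

```python
import re

# Simplified pattern: a versioned RefSeq chromosome accession (NC_...),
# followed by a genomic (g.) single-base substitution.
GENOMIC_SUBSTITUTION = re.compile(
    r"^(?P<accession>NC_\d+\.\d+):g\."
    r"(?P<position>\d+)(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def parse_genomic_hgvs(description):
    """Return (accession, position, ref, alt), or None if there is no match."""
    match = GENOMIC_SUBSTITUTION.match(description.strip())
    if match is None:
        return None
    return (
        match["accession"],
        int(match["position"]),
        match["ref"],
        match["alt"],
    )


# Placeholder identifiers, not real variants.
print(parse_genomic_hgvs("NC_000000.0:g.100A>G"))
print(parse_genomic_hgvs("NM_000000.0:c.100A>G"))
```

Returning None for anything that is not an NC_ genomic description lets the calling code report a bad input clearly instead of sending a malformed query to ClinVar.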

Future:

  • Discuss moving Monday meetings to 9:00 - 10:00 (they currently start at 10:00), to find a slot that suits everyone's schedules.
  • Consider asking the User if they want gnomAD scores. gnomAD (Genome Aggregation Database) reports how frequently a variant occurs in large populations, which helps distinguish rare, potentially disease-causing variants from common benign ones. This would be offered as a user-selectable option.
  • Give the user the option of downloading the latest version of ClinVar? ClinVar is continuously updated with new variant interpretations, so this would let users work from the most current data.
