Assignment #10 - Building your own R package
This week, our objectives were:
- Learn the structure of an R package and the role of the DESCRIPTION file
- Practice writing machine-readable metadata (authors, dependencies, versioning)
- Draft a coherent package proposal for your final project
- Publish your package scaffold proposal online (GitHub and blog)
My R package: disordR
- Purpose and scope of the project
- The disordR package is a small toolkit for students and researchers studying intrinsically disordered proteins (IDPs). It offers a simple pipeline that can be used in class projects or more advanced research. First, it will calculate charge-hydropathy metrics (based on the Uversky plot). It will then produce a concise summary of disorder predictor outputs, and finally, it will map AlphaFold pLDDT scores to putative intrinsically disordered regions (IDRs).
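As a rough illustration of the charge-hydropathy step, the sketch below classifies a single sequence against the Uversky boundary, <R> = 2.785<H> - 1.151. The function name uversky_sketch(), the use of the Kyte-Doolittle scale rescaled to 0-1, and the choice to treat histidine as neutral are assumptions made for this example, not the final disordR implementation.

```r
# Kyte-Doolittle hydropathy values for the 20 standard amino acids
kd <- c(A = 1.8, R = -4.5, N = -3.5, D = -3.5, C = 2.5,
        Q = -3.5, E = -3.5, G = -0.4, H = -3.2, I = 4.5,
        L = 3.8, K = -3.9, M = 1.9, F = 2.8, P = -1.6,
        S = -0.8, T = -0.7, W = -0.9, Y = -1.3, V = 4.2)

# Hypothetical sketch of a charge-hydropathy classifier (assumed name)
uversky_sketch <- function(seq) {
  aa <- strsplit(toupper(seq), "")[[1]]
  aa <- aa[aa %in% names(kd)]          # drop non-standard residues
  stopifnot(length(aa) > 0)

  # Mean hydropathy, rescaled from [-4.5, 4.5] to [0, 1]
  mean_h <- mean((kd[aa] + 4.5) / 9)

  # Absolute mean net charge: K/R = +1, D/E = -1 (His assumed neutral)
  net <- sum(aa %in% c("K", "R")) - sum(aa %in% c("D", "E"))
  mean_r <- abs(net) / length(aa)

  # Points above the boundary line fall in the disordered region
  boundary_r <- 2.785 * mean_h - 1.151
  list(mean_hydropathy = mean_h,
       mean_net_charge = mean_r,
       class = if (mean_r > boundary_r) "disordered" else "ordered")
}

uversky_sketch("EEEEKKKKEE")$class   # highly charged, low hydropathy
uversky_sketch("IIIILLLLVV")$class   # hydrophobic, uncharged
```

Returning a plain list keeps the example dependency-free; the real function could return a tibble instead, since tibble is among the planned imports.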
- Key functions I plan to implement
- aa_props() - mean hydropathy, net charge, and fraction of charged residues
- uversky_metrics() - mean hydropathy and mean net charge per residue; classifies a sequence as IDP or ordered
- uversky_plot() - scatter plot of sequences against the Uversky boundary
- consensus_disorder() - mean consensus of predictor scores
- idr_segments() - IDR intervals
- read_af2() - per-residue pLDDT from AlphaFold
- idr_from_plddt() - IDR segments from pLDDT
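To make the segment-finding idea behind idr_segments() and idr_from_plddt() concrete, here is a minimal sketch that turns a vector of per-residue pLDDT scores into intervals using run-length encoding. The function name idr_segments_sketch(), the pLDDT cutoff of 50, and the minimum segment length of 5 are all assumptions chosen for illustration; the real functions may use different defaults.

```r
# Hypothetical sketch: find runs of low-confidence residues in a pLDDT vector
idr_segments_sketch <- function(plddt, threshold = 50, min_len = 5) {
  stopifnot(is.numeric(plddt), !anyNA(plddt))

  low  <- plddt < threshold            # TRUE where confidence is low
  runs <- rle(low)                     # consecutive TRUE/FALSE runs
  ends   <- cumsum(runs$lengths)       # last residue index of each run
  starts <- ends - runs$lengths + 1    # first residue index of each run

  # Keep only low-confidence runs that are long enough
  keep <- runs$values & runs$lengths >= min_len
  data.frame(start  = starts[keep],
             end    = ends[keep],
             length = runs$lengths[keep])
}

# Toy pLDDT profile (made-up values): one long low run, one short low run
plddt <- c(rep(90, 10), rep(30, 8), rep(85, 5), rep(40, 3))
idr_segments_sketch(plddt)
```

With these toy values only residues 11-18 are reported, because the final three-residue run is shorter than min_len.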
- How I chose the fields in DESCRIPTION
- Package/Title/Description: I wanted to keep the description simple and easy to understand by highlighting the three main features (charge-hydropathy, consensus disorder, pLDDT IDRs).
- Version: 0.0.0.9000 is the development version provided by the professor for the class project
- Authors@R: I am listed as the author for accountability, contact, and rights.
- Depends: R (>= 3.1.2) as listed by the professor
- Imports: tibble, dplyr, stringr, readr, ggplot2 for clean data management and simple plot generation. bio3d is listed under "Suggests" instead, because of potential issues with AlphaFold files and long run times. This way, the package can still build even if bio3d is not installed.
- License: CC0, since it makes reuse simple
- Encoding: UTF-8, avoids issues with dashes/quotes in metadata
- LazyData: "true" since a small dataset may be included
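Put together, the fields above could look roughly like the DESCRIPTION written by the snippet below, which saves the file to a temporary directory and reads it back with read.dcf() as a quick sanity check. The Title and Description wording, the person() name, and the email address are placeholders, not the values in the actual repo. The small read_af2_guard() helper is also hypothetical; it shows the usual requireNamespace() pattern for a package listed under Suggests.

```r
# Placeholder DESCRIPTION content; author details and wording are assumptions
desc_lines <- c(
  "Package: disordR",
  "Title: Charge-Hydropathy, Consensus Disorder, and pLDDT-Based IDR Tools",
  "Version: 0.0.0.9000",
  'Authors@R: person("First", "Last", email = "first.last@example.com",',
  '    role = c("aut", "cre"))',
  "Description: Computes charge-hydropathy (Uversky) metrics, summarizes",
  "    disorder predictor scores, and maps AlphaFold pLDDT to putative",
  "    intrinsically disordered regions.",
  "Depends: R (>= 3.1.2)",
  "Imports: tibble, dplyr, stringr, readr, ggplot2",
  "Suggests: bio3d",
  "License: CC0",
  "Encoding: UTF-8",
  "LazyData: true"
)

# Write to a temp directory and parse it back as a basic format check
path <- file.path(tempdir(), "DESCRIPTION")
writeLines(desc_lines, path)
desc <- read.dcf(path)
desc[1, c("Package", "Version", "Imports", "Suggests")]

# Hypothetical guard for code that needs the suggested package bio3d
read_af2_guard <- function() {
  if (!requireNamespace("bio3d", quietly = TRUE)) {
    stop("Package 'bio3d' is required for this function; please install it.",
         call. = FALSE)
  }
  invisible(TRUE)
}
```

Because bio3d is only checked at call time, the rest of the package would load and run without it.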
- Link to GitHub repo for disordR