Final Review

On the first day of this class, we became familiar with Python, downloaded Jupyter, and installed several packages for coding. We learned the basics of what cancer is, as well as the different types of cancer. We created GitHub repositories for our personal blogs and learned how to upload posts to record what we did in each session. We each then researched a type of cancer that interested us (I researched leukemia and used data on acute myeloid leukemia in Jupyter Notebook). Later, it was interesting to hear more about the Human Genome Project, which sequenced the nucleotide bases of the entire human genome, and about how sequencing became easier and cheaper over time as technology developed.

We began learning how to plot math functions and make pie charts from data in Jupyter Notebook. Making more advanced plots, such as the violin plot, was confusing at first, but it helped me a lot with understanding Python. We used integrin gene data for our plots.

Then, we discussed AI’s importance in the medical field. We learned about machine learning and the different metrics of model performance, such as accuracy, precision, F1-score, sensitivity, specificity, and AUROC. True and false positives, along with true and false negatives, are the basis for these metrics and for confusion matrices. We built machine learning models using logistic regression to classify different organs based on gene expression.

There are three types of data related to the central dogma: genomic data (involving DNA), transcriptomic data (involving RNA and gene expression), and proteomic data (involving proteins; less often used). We learned about tumor mutations, such as SNPs, which are differences in a single base in the DNA sequence, as well as larger-scale structural variants at the chromosome level, such as translocations, duplications, insertions, and deletions.
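As a rough sketch of how the model metrics mentioned earlier relate to the four confusion matrix counts, the function below computes accuracy, precision, sensitivity, specificity, and F1-score from true/false positives and negatives. The function name `classification_metrics` and the example counts are made up for illustration; they are not taken from the class notebooks. AUROC is left out because it needs predicted scores rather than counts.

```python
def safe_divide(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Compute common metrics from confusion matrix counts (hypothetical helper)."""
    precision = safe_divide(tp, tp + fp)      # of predicted positives, how many are real
    sensitivity = safe_divide(tp, tp + fn)    # of real positives, how many were found (recall)
    specificity = safe_divide(tn, tn + fp)    # of real negatives, how many were found
    accuracy = safe_divide(tp + tn, tp + fp + tn + fn)
    f1 = safe_divide(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Illustrative counts only (not real results)
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In practice, a library such as scikit-learn can compute these values directly from predicted and true labels; the hand-written version is only meant to show where each number comes from.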
We were introduced to cancer cell lines, cancer subtypes, and drug dose response curves. Finally, we learned about cancer metastasis, which is how cancer cells travel from one area of the body to other areas and form metastatic tumors, which can lead to death. We also learned about the Kaplan-Meier plot, a statistical tool for survival analysis, and plotted it in Jupyter Notebook using TCGA data on survival for different types of cancer, such as breast cancer and acute myeloid leukemia. Throughout multiple sessions, we learned about different companies, projects, experiments, and technologies that have contributed to advancing cancer research, and about how big data can lead to further advancements in the future.

From the sessions this summer, I’ve enjoyed seeing how code can create various types of output, and the most interesting part to me was seeing how the split violin plots were created. From the lessons, I found the topic of cancer metastasis the most interesting, particularly the method CellSearch uses, with EpCAM ferrofluid, to magnetically separate and enumerate circulating tumor cells (CTCs), as well as how CTCs are identified by staining for cytokeratin, CD45, and DAPI.
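To illustrate what a Kaplan-Meier plot shows, here is a minimal sketch of the estimate behind it in plain Python. At each time where a death is observed, the survival probability is multiplied by the fraction of at-risk patients who survived that time; censored patients (those whose follow-up ended without a death) only reduce the number at risk. The function name `kaplan_meier` and the small data set are invented for illustration and are not the TCGA data used in class.

```python
def kaplan_meier(durations, events):
    """Return a list of (time, survival_probability) at each observed event time.

    durations: follow-up time for each patient
    events: 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the at-risk group
        at_risk -= removed
    return curve


# Made-up follow-up times (in months) and event flags, for illustration only
durations = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(durations, events)
for time, prob in curve:
    print(f"month {time}: survival = {prob:.3f}")
```

Plotting these points as a step function gives the familiar staircase-shaped survival curve. Packages such as lifelines offer a full implementation with confidence intervals; this version only shows the core calculation.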

Cancer Subtypes And Drug Dose Response Curves Jupyter Notebook

Liquid Biopsy for Cancer Metastasis Detections