Katie Williams
answered on 16 Nov 2022:
last edited 16 Nov 2022 12:30 pm
The data we deal with in genetics comes in file formats that you won’t find in most other fields! It’s my job to understand all the different types of genetic data files we have and what they mean.
It isn’t possible to read your whole DNA in one go, so instead the DNA is split up into lots of shorter pieces, which can then be read separately. One of the file types we deal with in genetics is the FASTQ file, which stores all these little sections of DNA that we read from a sample. On its own, however, this data isn’t very useful to us, as what you actually want to know is whether the DNA has any mutations. We have code that puts all the pieces of DNA in the FASTQ file back together and matches them up, so that we can then check for any mutations.
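To give a rough idea of what a FASTQ file looks like inside, here is a minimal sketch in Python. Each piece of DNA (a "read") takes up four lines: a name starting with `@`, the DNA letters, a line starting with `+`, and a string of quality characters, one per letter. The function names (`parse_fastq`, `phred_scores`) and the two example reads are made up for illustration, and the quality decoding assumes the common Phred+33 encoding. This is only a sketch, not the code used in any real pipeline.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO


class FastqRead(NamedTuple):
    name: str      # read identifier (header line without the leading "@")
    sequence: str  # DNA letters, e.g. "GATTACA"
    quality: str   # one quality character per DNA letter


def parse_fastq(handle: TextIO) -> Iterator[FastqRead]:
    """Yield reads from a FASTQ file, assuming exactly four lines per read."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"Malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"Sequence/quality length mismatch in {header!r}")
        yield FastqRead(header[1:], sequence, quality)


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Convert quality characters to numbers (assumes Phred+33 encoding)."""
    return [ord(char) - offset for char in quality]


# Two made-up reads, purely for illustration.
EXAMPLE = "@read1\nGATTACA\n+\nIIIIIII\n@read2\nACGT\n+\n!!!!\n"

reads = list(parse_fastq(io.StringIO(EXAMPLE)))
for read in reads:
    print(read.name, read.sequence, phred_scores(read.quality))
```

Putting the reads back together and checking for mutations is a much bigger job than this, which is why the FASTQ file is only the starting point.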
—————————————————
In terms of programming languages, Python and Bash are the main ones used in my department. I also know R and MATLAB from university, as well as some frontend languages (e.g. HTML), as I’ve had to make a few websites.
In terms of devices, at work I have an Ubuntu computer for coding, and I also have a personal Windows laptop which I can use when I’m not in the office.