Long-read DNA analysis can give rise to errors

New methods that can read lengthy sections of genetic material, which is represented as a series of letters, are up to 99.8 per cent accurate. In a genome of more than 3 billion letters, however, this may equate to millions of mistakes in the results. Experts from The Roslin Institute led by Professor Mick Watson examined three recent studies reporting human genome sequences from long-read technologies. They found that the data contained thousands of errors even after corrective software was used, leading them to conclude that data produced by these technologies should be interpreted with caution.
Such errors may create problems for analysing genetic information from people and animals. For example, they may falsely indicate that an individual has a genetic difference that heightens their risk of a particular disease. Mistakes of this kind could have major implications if these technologies are used in clinical studies to diagnose patients, the team suggests.
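To see how a 99.8 per cent accuracy rate can translate into millions of mistakes, the short calculation below multiplies the genome length by the error rate. It is only an illustrative sketch, not the method used in the study: the function name `expected_errors` is hypothetical, the genome is rounded to 3 billion letters, and errors are assumed to be spread evenly and independently across the sequence.

```python
def expected_errors(genome_length: int, accuracy: float) -> float:
    """Expected number of misread letters.

    Assumption (for illustration only): every letter has the same,
    independent chance of being read incorrectly.
    """
    error_rate = 1.0 - accuracy
    return genome_length * error_rate


# Figures taken from the article; the genome length is rounded down.
GENOME_LENGTH = 3_000_000_000  # "more than 3 billion letters"
ACCURACY = 0.998               # "up to 99.8 per cent accurate"

errors = expected_errors(GENOME_LENGTH, ACCURACY)
print(f"About {round(errors):,} misread letters expected before correction")
```

Under these assumptions the estimate comes to roughly 6 million misread letters before any corrective software is applied, which is consistent with the "millions of mistakes" described above.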
Professor Watson said: "Long-read technologies are incredibly powerful but it is clear that we can't rely on software tools to correct errors in the data – some hands-on expertise may still be required. This is important as we increasingly use genomic technologies to understand the world around us."
The study is published in Nature Biotechnology. The Roslin Institute receives strategic funding from the Biotechnology and Biological Sciences Research Council.