
Referee request response (decline)

Dear Editor,

Thank you for your invitation to review this manuscript for your journal. Unfortunately, I must decline the invitation given that, as a matter of principle, I do not support or endorse the activities of for-profit scientific journals.

The scientific community has previously offered this industry, free of charge:

  • Conducting all scientific research.
  • Writing all scientific manuscripts.
  • Acting voluntarily in editorial roles.
  • Performing all refereeing.
  • (i.e. the entire workload of your organisation, other than hosting the website on which you serve the PDFs).

In exchange, we receive:

  • Massive journal subscription fees.
  • Article download fees.
  • Article publication fees.
  • Intimidation tactics employed against us when we prefer not to be a part of it.
  • Anti-competitive and financially predatory distribution tactics.
  • Institutionalised mandates for the above.

This is not a symbiotic relationship, but a parasitic one, for the larger part financed by the taxpayer, who should rather be financing our research. I can no longer endorse this one-sided relationship, in which for-profit journals effectively tax scientific research, to the tune of billions of dollars annually, often using coercive and intimidatory sales tactics, whilst providing very little or no value in return. This capital is best spent on what it was intended for — scientific research for the benefit of humankind — training students, hiring research staff, financing equipment, travel and infrastructure — to which your organisation contributes nothing whatsoever other than to extort value.

In addition to declining this offer, please for future reference:

  • Remove my name from your referee database.
  • Immediately cease and desist from using intimidatory tactics when I decline to volunteer my labour (which is of very high value) to your pursuit of profit (in exchange for nothing).
  • Hassling me for failing to voluntarily contribute my labour to your revenue-raising is tantamount to harassment and extortion.
  • Do not request that I voluntarily act as your journal editor.
  • Do not work in cahoots with national scientific funding agencies to enforce your own vendor lock-in, thereby effectively mandating your own services, which are in fact of very little or no value whatsoever. This is an indirect form of taxation upon scientific research, which I have no interest in paying, and which we should not be expected or forced to pay.
  • I do not intend personally to submit any further manuscripts to your journal for consideration (if my co-authors do, I won’t stand in their way).

Personal note to the Editor: this should not be construed as a personal attack against you, whom I absolutely respect, but rather against the industry which is exploiting you in a slave-like work relationship, whilst using you as a conduit to engage me for the same purpose. I write this as an act of solidarity with you, not as a personal attack against you.

We advance human knowledge for the benefit of humanity, and provide it as a gift for all.

Referee 2.

(This post may be freely linked to, reused, or modified without acknowledgement)

ARC Future Fellowship

I'm pleased and honoured to announce that I have just been awarded a prestigious ARC Future Fellowship to conduct a four-year project on quantum networking and encrypted quantum computation. I will be based at the University of Technology Sydney, where I have received tenure as a Senior Lecturer. Ad astra.

The future of DNA sequencing

DNA sequencing is a field in molecular biology with uses ranging from understanding the genetic basis for cancer, to diagnosing genetic predispositions, to understanding the basic way in which cells function at the molecular level. I recently joined the International Cancer Genome Consortium (ICGC), where we aim to catalogue the genetic makeup of as many different cancer types as possible. To do this we must employ sequencing technologies which allow genomes to be experimentally determined. When the Human Genome Project first sequenced the entire human genome, they employed a technique called Sanger sequencing, which essentially steps through every nucleotide, one after another, and determines whether it is an A, C, G or T - the four nucleotide types from which DNA sequences are constructed.

Unfortunately the Sanger approach is both slow and costly. The Human Genome Project cost around $3 billion, took many years, and employed hundreds of scientists. With technological improvements this could now be done for several million dollars in a fraction of the time. However, this approach is still too costly to sequence hundreds or thousands of unique genomes.

Recently a different approach has emerged: high-throughput sequencing technologies, which allow an entire human genome to be sequenced for $10,000, by a couple of scientists, in under a week. As technology improves it probably isn't unrealistic to expect that in the coming decade this could be done for merely hundreds of dollars. Several competing high-throughput technologies have emerged, but they all operate in a similar fashion. Rather than sequencing an entire DNA strand from start to finish, they fragment the DNA into millions of small pieces, called 'reads', which are on the order of 50 nucleotides in length. Each of these fragments is sequenced independently and in parallel, which takes a fraction of the time and cost of traditional Sanger sequencing.

So now we've sequenced millions of tiny fragments - what do we do with them? There are two approaches to utilizing this data.

The first approach is to map the fragments to a reference genome. Suppose we have the entire human genome, thanks to the Human Genome Project. Then we can take each read and look at where it maximally overlaps with the reference. By looking at where the mapped reads sit on the reference genome we can see what the differences are. For example, a read might differ from its mapped location on the reference by a few nucleotides. These differing nucleotides are called Single Nucleotide Polymorphisms, or SNPs (pronounced 'snips'), and tell us what's different between you and me. The SNPs, which constitute only a tiny fraction of the genome, give us all sorts of useful information, like predispositions, mutations and genetic traits.

The second approach to utilizing reads is to attempt 'de novo assembly'. Here, we take all the reads and look at how they overlap with one another, as opposed to how they overlap with a reference. So if the last few nucleotides of one read overlap with the first few of another, then we can conclude that those pieces fit together. We are essentially left with a huge jigsaw puzzle that we must piece together. The advantage of this approach is that it does not require a reference, so it can be used to sequence large chunks of a genome in the absence of any a priori information about the genome.

Both of these approaches are useful. Mapping is useful when we have a reference genome to compare against, while de novo assembly is useful when we don't.
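To make the two ideas concrete, here is a toy Python sketch. It is an illustration only, not how real sequencing software works: the function names, the brute-force search, and the `min_len` threshold are all my own assumptions, and real tools use indexed references, allow insertions and deletions, and handle millions of reads. The first part maps a read to the position on a reference with the fewest mismatches and reports the differing nucleotides; the second part joins two reads when the end of one matches the start of the other.

```python
def hamming(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def map_read(read, reference):
    """Return (position, mismatches) for the best ungapped placement of read.

    Brute force: try every offset and keep the first one with the fewest
    mismatches. Hypothetical helper, for illustration only.
    """
    best_pos, best_mm = -1, len(read) + 1
    for pos in range(len(reference) - len(read) + 1):
        mm = hamming(read, reference[pos:pos + len(read)])
        if mm < best_mm:
            best_pos, best_mm = pos, mm
    return best_pos, best_mm


def call_differences(read, reference):
    """List (reference_position, reference_base, read_base) per mismatch."""
    pos, _ = map_read(read, reference)
    return [(pos + i, reference[pos + i], base)
            for i, base in enumerate(read)
            if base != reference[pos + i]]


def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Overlaps shorter than min_len (an assumed threshold) are ignored,
    since very short matches are likely to occur by chance.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def merge(a, b, min_len=3):
    """Join a and b on their overlap, or return None if there is none."""
    k = overlap(a, b, min_len)
    return a + b[k:] if k else None


if __name__ == "__main__":
    reference = "ACGTACGGTCATTGCA"          # made-up reference sequence
    read = "GGTCTTTG"                       # made-up read with one difference
    print(map_read(read, reference))        # (6, 1)
    print(call_differences(read, reference))  # [(10, 'A', 'T')]
    print(merge("ACGTAC", "TACGGT"))        # ACGTACGGT
```

In the mapping example the read lands at position 6 of the reference with a single mismatch at position 10, which is the kind of difference described above as a SNP. In the assembly example the two reads share the three nucleotides 'TAC', so they fit together like adjacent jigsaw pieces.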

Unfortunately both these approaches have some limitations. In particular, data from present-day sequencing technologies have quite high error rates. This makes it more difficult to map a read against a reference, and it also makes it more difficult to perform de novo assembly, since the pieces in the puzzle don't always fit together. So error detection and correction mechanisms have to be employed. Having said that, significant improvements are being made to current high-throughput technologies, which are incrementally reducing error rates and increasing throughput. This gives us more pieces in the puzzle, with lower error rates, and therefore a higher likelihood of finding matching pieces.

The future of DNA sequencing is looking very promising. We can do in a week what previously took years, and we can do it with many orders of magnitude reduction in cost. As this technology becomes more widespread I'm sure that economies of scale and technological improvements will continue to put downward pressure on cost and upward pressure on throughput and data quality.

Of course, the accessibility of this technology carries with it some significant moral dilemmas. Who should be allowed access to this kind of technology? Employers? Health insurance companies? Life insurance agencies? Already there are private companies like 23andMe which allow a person's SNPs to be sequenced for a few hundred dollars, revealing a person's predisposition to physical and mental illnesses, racial background, likelihood of being intelligent, or any number of other traits that an unscrupulous employer might be interested in. There is enormous potential for misuse, which to my mind policy makers should consider addressing sooner rather than later.