Screenager: screening times at bioRxiv



[This article was first published on Rstats – quantixed, and kindly contributed to R-bloggers.]



When a preprint is uploaded to bioRxiv, it undergoes screening before it appears online. How long does it take for Affiliates to screen preprints at bioRxiv?

tl;dr I used R to look at bioRxiv screening times. Even though bioRxiv has expanded massively, screening happens quickly (in about 24 h).

I am a bioRxiv Affiliate, one of the people who does the screening. Preprints wait in a queue to be screened. Over the years, I have seen the typical queue get longer and longer. In the early days, the queue was maybe 10 preprints. These days, it is usually more than 100.

It is a team effort and more Affiliates have been recruited over time. Still, I often wonder how we are doing. My impression is that there are always lots of Neuroscience and Bioinformatics papers in the queue. Is any subject area being neglected? If so, should Affiliates in those areas be recruited specifically?

To look at these questions, I used this wonderful R client for the bioRxiv API, written by Nicholas Fraser.

To set up:

# install the rbiorxiv client from GitHub (only needed once)
devtools::install_github("nicholasmfraser/rbiorxiv")
# load packages
library(rbiorxiv)
library(tidyverse)
library(gridExtra)

# make directory for output if it doesn't exist
if (!dir.exists("./output")) dir.create("./output")
if (!dir.exists("./output/plots")) dir.create("./output/plots")

Use the R client to get a data frame of preprints uploaded in 2020.

data 
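The call itself did not survive in this copy of the post. As a minimal sketch, assuming the `biorxiv_content()` function from rbiorxiv and its `from`, `to`, `limit` and `format` arguments, the download could look something like this. The wrapper name `fetch_preprints` and the default end date are illustrative and not taken from the original; check the argument names against the package documentation.

```r
# Hypothetical helper (not from the original post): download bioRxiv
# preprint metadata for a date range as a data frame.
# Assumes rbiorxiv::biorxiv_content() accepts from, to, limit and format.
fetch_preprints <- function(from = "2020-01-01", to = "2020-03-31") {
  rbiorxiv::biorxiv_content(
    from = from,    # first posting date to include (YYYY-MM-DD)
    to = to,        # last posting date to include; illustrative default
    limit = "*",    # "*" requests all records rather than a capped number
    format = "df"   # return a data frame instead of a list
  )
}

# Usage (needs the rbiorxiv package and network access):
# data <- fetch_preprints()
```

Wrapping the call in a function keeps the date range in one place, which helps if the analysis is re-run later in the year.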

We only want to look at new preprints (Version 1) and not revisions, so let’s filter for that. Then, we’ll take advantage of bioRxiv’s new-style DOIs, which embed a date, to find the “submission date”.

data 

We now have a column called ‘days’ that shows the time in days from “submission” to “publication”. We will use this as a measure of screening time. Note: this is imperfect because the submission date is when an author begins uploading their preprint (they could take several days to do this) and not when it actually gets submitted to bioRxiv.
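The code for this step is missing from this copy, so here is a rough sketch of the idea on made-up rows. New-style bioRxiv DOIs look like `10.1101/2020.01.01.123456`, so the date can be read from characters 9 to 18. The column names `doi`, `date` and `version` are assumptions about the API result, the values are invented, and base R is used so the example runs without extra packages.

```r
# Toy data standing in for the API result (values are invented).
data <- data.frame(
  doi = c("10.1101/2020.01.01.123456",
          "10.1101/2020.01.03.654321",
          "10.1101/2020.01.03.654321"),
  date = c("2020-01-02", "2020-01-06", "2020-01-20"),
  version = c(1, 1, 2),
  stringsAsFactors = FALSE
)

# Keep first versions only; revisions would inflate the lag.
data <- data[data$version == 1, ]

# "10.1101/" is 8 characters, so the date occupies characters 9 to 18.
data$doi_date <- gsub(".", "-", substr(data$doi, 9, 18), fixed = TRUE)

# Days between the date in the DOI and the posting date.
data$days <- as.numeric(as.Date(data$date) - as.Date(data$doi_date))

print(data[, c("doi_date", "date", "days")])
```

Older DOIs without an embedded date would not parse this way, so they would need to be filtered out before the `as.Date()` step.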

Let’s look at the screening time per subject area.

p1 
Histogram of screening times per subject

I was surprised to see that, with the exception of “Scientific Communication and Education”, the screening times were pretty constant across categories.


The subject areas on bioRxiv are not equal in size. Look at the numbers on the axes for Zoology and for Neuroscience to get a feel for the difference. The histogram view conceals these differences.

Next, we can calculate the average screening time and see if the busiest categories suffer delayed screening.

df1 
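The summary code is cut off here. A minimal sketch of this kind of summary, on invented numbers, might look like the following. It assumes a `category` column alongside the `days` column described earlier, and uses base R `aggregate()` rather than whatever the original used.

```r
# Invented screening times for three categories (not real bioRxiv data).
data <- data.frame(
  category = c("neuroscience", "neuroscience", "neuroscience",
               "cell biology", "cell biology", "zoology"),
  days = c(1, 0, 2, 0, 1, 3),
  stringsAsFactors = FALSE
)

# Mean screening time per category.
df1 <- aggregate(days ~ category, data = data, FUN = mean)
names(df1)[2] <- "mean_days"

# Number of preprints per category, as a measure of how busy it is.
counts <- table(data$category)
df1$n <- as.vector(counts[as.character(df1$category)])

# Busiest categories first.
df1 <- df1[order(-df1$n), ]
print(df1)
```

Keeping the count next to the mean makes it easy to check whether the busiest categories are also the slowest.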

And then make some bar charts to look at the data.

p3 

The average screening time is 1 day or less. Neuroscience, microbiology and bioinformatics (the biggest categories) have similar screening delays to less busy categories. So, assuming that Affiliates screen on the basis of expertise, either the pool is enriched for these popular areas, or those Affiliates are busier!

The longest lag is for “Scientific Communication and Education”, which is a very small category. Assuming the authors take a similar time to upload these manuscripts, I guess the Affiliates tend to screen these preprints as a lower priority. These papers do tend to be a bit different from other research papers and have separate screening criteria. Anyway, they still get screened in just over 2 days, which is still impressive.

I was pleased to see “Cell Biology” had the shortest screening time (around half a day)!

Conclusion

Even though my impression was that Bioinformatics and Neuroscience papers linger in the queue, this is not actually the case. There’s likely more of them in the queue because there are more of them, period.

The bioRxiv team have done a great job in maintaining a pool of Affiliates that can screen the huge number of preprints that are uploaded.

The post title comes from “Screenager” by Muse from their Origin of Symmetry album.


