Martin Fenner edited this page Jun 22, 2013 · 36 revisions

Don't wait until the day of the event (July 6) to come up with your idea or organize your team. Hack events are about self-organization, whether you decide to join a team or go solo. Use this space to pitch your ideas ahead of time and start forming teams around them.

Suggested format for those with an idea:

  1. The idea (no more than 1-2 paragraphs, but links out to other pages are OK).
  2. Your name
  3. What skills you bring to make your idea happen
  4. What complementary skills you still need from teammates.

Anyone wanting to join that team should add their name below the idea.


Teams

Figure Mining & Enrichment

  1. Idea: Mash up PLOS figures (http://plos.figshare.com/), classify them into types, and enrich or further annotate the existing metadata. See if we can extract data from figures (e.g. the coordinates of an x,y plot) and provide that data in a machine-readable form.

Tools/Approaches: OCR, Machine Learning, Supervised machine learning, broad metadata ontology

I also have 4 million unique DOIs from CiteULike that I'd like to explore and classify by publisher, journal, etc. (not sure if this is relevant to the hack day aims, but I'll just throw it out there; it's an interesting chunk of data...)
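One possible starting point for the "classify by publisher" step: every DOI begins with a registrant prefix (the part before the first slash), so counting prefixes gives a rough first grouping. The sketch below is only an illustration under that assumption. The function names and the sample DOIs are placeholders, and mapping a prefix to a publisher name would need a separate lookup that is not shown here.

```python
from collections import Counter

# Leading forms that are sometimes stored together with the DOI itself.
LEADS = ("https://doi.org/", "http://doi.org/", "http://dx.doi.org/", "doi:")


def doi_prefix(doi):
    """Return the registrant prefix (text before the first '/'), or None."""
    doi = doi.strip().lower()
    for lead in LEADS:
        if doi.startswith(lead):
            doi = doi[len(lead):]
            break
    prefix, sep, suffix = doi.partition("/")
    # A usable DOI has a "10." prefix, a slash, and a non-empty suffix.
    if not sep or not suffix or not prefix.startswith("10."):
        return None
    return prefix


def count_by_prefix(dois):
    """Count DOIs per registrant prefix, skipping strings that do not parse."""
    return Counter(p for p in map(doi_prefix, dois) if p is not None)


# Placeholder DOIs for illustration only, not taken from the CiteULike data.
sample = [
    "10.1371/journal.pone.0000001",
    "doi:10.1371/journal.pbio.0000002",
    "http://dx.doi.org/10.1000/xyz123",
    "not-a-doi",
]
counts = count_by_prefix(sample)
print(counts.most_common())
```

With 4 million DOIs the same function can be applied line by line while reading the file, so the whole list never has to sit in memory at once.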

  1. Ross Mounce, Community Coordinator for Open Science at the Open Knowledge Foundation

  2. Skills: enthusiasm

  3. Need teammates!

--

Suggestion engine for relevant / interesting scholarly articles

  1. No idea if this is in the remit of the event, or if it is too big / ambitious (I'm really not sure what to expect on the day!) but I'm going to throw it in and see what people think!

Mendeley is a great tool for organising papers, articles and conference proceedings for academic work. However, it could do more for the discovery of new articles. Listening to Spotify Radio one day, it occurred to me: can the same algorithms that Spotify and its like (last.fm, Pandora, TasteKid, etc.) use to find new music be used to discover new research articles? Can we use some technique (e.g. a multivariate classifier, SVM, etc.) to learn associations between articles and use these to recommend articles that are not currently in a user's Mendeley account? Mendeley has over 2m accounts from which associations between articles could be derived. Citation relationships, which can perhaps be pulled from databases such as PubMed, could also be used. The matching algorithm could also be restricted to consider only one article or group of articles if the user wants to find something on a specific topic.

As a researcher, I often find by chance articles that are highly relevant to me and that I wish I had found earlier, but didn't because I was using the wrong search terms or looking in the wrong databases / journals. Such a tool would increase researchers' exposure to the latest trends in their research field.

  1. Mark Drakesmith, a post-doctoral neuroscientist at Cardiff University

  2. Some programming skills (MATLAB, Python and a bit of C++) but no experience of 'hacking', handling databases, etc. A keenness to do something outside my comfort zone!

  3. Anyone who is interested! Particularly people with more knowledge or experience of accessing / using this type of data.

Georg Walther:

Love the idea. We could start by fingerprinting the abstract and / or main text (if available) of articles. Maybe one way of making a start would be to use a library such as http://nltk.org/ to parse the corresponding bodies of text and count the occurrence of all / some words as a fingerprint. This would probably require storing these fingerprints for later queries. Your idea also seems to be of general interest to Mendeley: http://krisjack.wordpress.com/2013/05/30/workshop-on-academic-industrial-collaborations-for-recommender-systems/

Just a bit of fun ... probably nothing useful at this stage.

PDFs can probably be handled with pdftotext and pdfinfo (both Linux command-line tools), which seem to be used by Zotero for metadata extraction.
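Once plain text is available, the word-count fingerprint suggested above might look roughly like the sketch below. To stay self-contained it uses only the Python standard library (a regular expression instead of NLTK's tokenizers), a tiny hand-picked stopword list, and cosine similarity as the comparison measure. All of these are assumptions made for illustration, and the titles and texts in `library` are made up.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "on", "with"}


def fingerprint(text):
    """Count how often each non-stopword occurs in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine_similarity(fp_a, fp_b):
    """Cosine of the angle between two word-count vectors (0.0 if one is empty)."""
    shared = set(fp_a) & set(fp_b)
    dot = sum(fp_a[w] * fp_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in fp_a.values()))
    norm_b = math.sqrt(sum(c * c for c in fp_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(query_text, library, top_n=3):
    """Return titles from `library` ranked by similarity to the query text."""
    query_fp = fingerprint(query_text)
    scored = [
        (cosine_similarity(query_fp, fingerprint(text)), title)
        for title, text in library.items()
    ]
    scored.sort(reverse=True)
    return [title for score, title in scored[:top_n] if score > 0]


# Made-up abstracts, keyed by a short title.
library = {
    "A": "Diffusion imaging of white matter tracts in the human brain",
    "B": "Population genetics of island birds",
    "C": "White matter connectivity and brain networks",
}
print(recommend("brain white matter imaging", library))
```

Storing the `Counter` objects (for example as JSON) would cover the "storing these fingerprints for later queries" part; raw counts favour long texts, so some weighting scheme would probably be the next thing to try.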

Browser Plugin(s) for PubPeer.com

  1. PubPeer.com seems to be a new and promising platform for post-publication review / discussion of scientific articles.

Let's have a stab at developing a browser plugin -- assuming we can get a hold of pubpeer's API.

Ideas for the plugin:

  • add a browser button / GUI element for the plugin
  • when the user visits the abstract or full view of an article, detect the article's DOI and query pubpeer for existing comments -- to fetch DOIs or other identifiers we can probably make use of Zotero translators (if I understand these correctly)
  • if comments exist alert the user to their existence (non-intrusively)
  • give the user the ability to easily jump directly to the corresponding pubpeer page of the viewed article
  • (more advanced?) open a new panel / window that shows the corresponding pubpeer comments and allows the user to leave comments while keeping the browser window pointed at the article

  1. Georg Walther

  2. Python, C; hardly any experience with web services or browser plugins but keen to learn

  3. Browser plugin people; JavaScript; XUL (XML)
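The DOI-detection step in the plugin ideas above could be prototyped outside the browser first. The sketch below is in Python rather than JavaScript, purely to try out the logic. It assumes that the article page exposes the DOI in a `citation_doi` or `dc.identifier` meta tag (a convention many publisher pages follow, though not all) and falls back to a loose regular expression otherwise. The pattern, the class and the function name are illustrative choices, and nothing here touches PubPeer itself, since its API is still an open question.

```python
import re
from html.parser import HTMLParser

# Loose pattern for DOI-like strings; an assumption, not an official grammar.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s\"'<>]+")

# Meta tag names assumed to carry the DOI on publisher pages.
META_NAMES = ("citation_doi", "dc.identifier")


class MetaDOIParser(HTMLParser):
    """Remember the first DOI found in a known <meta> tag."""

    def __init__(self):
        super().__init__()
        self.doi = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.doi is not None:
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in META_NAMES:
            match = DOI_PATTERN.search(attributes.get("content") or "")
            if match:
                self.doi = match.group(0)


def find_doi(html):
    """Return a DOI from meta tags, else the first DOI-like string, else None."""
    parser = MetaDOIParser()
    parser.feed(html)
    if parser.doi:
        return parser.doi
    match = DOI_PATTERN.search(html)
    # Trim punctuation that often trails a DOI in running text.
    return match.group(0).rstrip(".,;)") if match else None


# Placeholder page for illustration only.
page = '<html><head><meta name="citation_doi" content="10.1371/journal.pone.0000001"></head></html>'
print(find_doi(page))
```

The same two-stage approach (meta tags first, regular expression as fallback) should carry over to a content script; Zotero translators may well do this more reliably for individual publishers.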

Author contributions in PLOS papers

All PLOS papers (currently about 80,000) include author contributions in a format similar to this:

Conceived and designed the experiments: HQ JKC AR NH. 
Performed the experiments: HQ JKC AR MP. 
Analyzed the data: HQ JKC AR MP NH. 
Contributed reagents/materials/analysis tools: CH. 
Wrote the paper: HQ JKC AR NH.

I want to use the PLOS Search API to do a systematic analysis of these author contributions, e.g. how many times the first author was involved in writing the paper, or how often we have co-authors who appear only in the "contributed reagents/materials/analysis tools" section.
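As a rough illustration of the analysis, the sketch below parses one contribution statement in the format shown above and lists the authors who appear in only one given role. It is written in Python only for the sake of a short self-contained example. It assumes that roles are separated by periods and that each role is followed by a colon and space-separated initials; the function names are made up, and fetching the statements from the PLOS Search API is not shown.

```python
from collections import defaultdict


def parse_contributions(statement):
    """Map each role to the list of author initials given for it."""
    roles = {}
    # Assumes "<role>: <initials> <initials> ..." entries separated by periods.
    for sentence in statement.split("."):
        role, sep, initials = sentence.partition(":")
        if sep and initials.strip():
            roles[role.strip()] = initials.split()
    return roles


def roles_by_author(roles):
    """Invert the mapping: author initials -> set of roles."""
    by_author = defaultdict(set)
    for role, authors in roles.items():
        for author in authors:
            by_author[author].add(role)
    return dict(by_author)


def authors_only_in(roles, role_name):
    """Authors whose only listed contribution is `role_name`."""
    return sorted(
        author
        for author, author_roles in roles_by_author(roles).items()
        if author_roles == {role_name}
    )


# The example statement quoted above.
statement = """Conceived and designed the experiments: HQ JKC AR NH.
Performed the experiments: HQ JKC AR MP.
Analyzed the data: HQ JKC AR MP NH.
Contributed reagents/materials/analysis tools: CH.
Wrote the paper: HQ JKC AR NH."""

roles = parse_contributions(statement)
print(authors_only_in(roles, "Contributed reagents/materials/analysis tools"))
```

Answering the first-author question would additionally need the author list of each paper, so that initials can be matched to author positions.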

This idea is also an exercise in searching the PLOS CC-BY content for machine-readable information, and in using R for data analysis and visualization. I would be happy to introduce people to R and the rplos package created by rOpenSci that makes working with the PLOS Search API much easier. We will do some nice visualizations with the results, and will write a report in markdown (using the R knitr package) that can be posted to the hack4ac website.

  1. Martin Fenner, technical lead of the PLOS article-level metrics project
  2. Experience in R, Ruby, JavaScript, PHP
  3. People with skills in R or interested in learning R.