Teams & Ideas
Don't wait until the day of the event (July 6) to come up with your idea or organize your team. Hack events are about self-organization, whether you decide to join a team or go solo. Use this space to pitch your ideas ahead of time and start forming teams around them.
Suggested format for those with an idea:
- The idea (no more than 1-2 paragraphs, but links out to other pages are OK).
- Your name
- What skills you bring to make your idea happen
- What complementary skills you still need in any teammates.
Anyone wanting to join that team should add their name below the idea.
- Idea: Figure Mining & Enrichment Mashup PLOS figures / Classify them into types & enrich/further annotate existing metadata. http://plos.figshare.com/ See if we can extract data from figures (e.g. the coordinates of an x,y plot) and provide that data in a machine-readable form.
Tools/Approaches: OCR, Machine Learning, Supervised machine learning, broad metadata ontology
I also have 4 million unique DOIs from CiteULike that I'd like to explore and classify by publisher, journal, etc. (not sure if this is relevant to the hack day aims, but I'll just throw it out there; it's an interesting chunk of data...)
- Ross Mounce, Community Coordinator for Open Science at the Open Knowledge Foundation
- Skills: enthusiasm
- Need teammates!
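As a possible starting point for the DOI exploration mentioned above, here is a minimal sketch that groups DOIs by their registrant prefix (the part before the first `/`). The function names and sample DOI strings are illustrative assumptions, not part of any existing tool. A prefix identifies the registrant, which is usually but not always the current publisher, so mapping prefixes to publisher or journal names would need a separate lookup that is not shown here.

```python
from collections import Counter

# Common ways a DOI may be written; stripped so all forms are treated alike.
LEADS = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)

def doi_prefix(doi):
    """Return the registrant prefix of a DOI string, or None if it does not look like a DOI."""
    doi = doi.strip().lower()
    for lead in LEADS:
        if doi.startswith(lead):
            doi = doi[len(lead):]
            break
    prefix, sep, suffix = doi.partition("/")
    if not sep or not suffix or not prefix.startswith("10."):
        return None
    return prefix

def count_by_prefix(dois):
    """Count DOIs per registrant prefix; unparseable strings are counted under 'invalid'."""
    counts = Counter()
    for doi in dois:
        prefix = doi_prefix(doi)
        counts[prefix if prefix is not None else "invalid"] += 1
    return counts

# Illustrative input only (10.1371 is the prefix used in PLOS DOIs; 10.9999 is made up).
sample = [
    "10.1371/journal.pone.0000001",
    "http://dx.doi.org/10.1371/journal.pbio.0000002",
    "doi:10.9999/example.123",
    "not a doi",
]
print(count_by_prefix(sample).most_common())
```

With 4 million DOIs the same loop should still be workable, since it only keeps one counter entry per distinct prefix.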
--
- No idea if this is in the remit of the event, or if it is too big / ambitious (I'm really not sure what to expect on the day!) but I'm going to throw it in and see what people think!
Mendeley is a great tool for organising papers, articles and conference proceedings for academic work. However, it could do more for the discovery of new articles. Listening to Spotify Radio one day, it occurred to me: can the same kind of algorithms that Spotify and its like (last.fm, Pandora, TasteKid, etc.) use to find new music be used to discover new research articles? Can we use some technique (e.g. a multivariate classifier, SVM, etc.) to learn associations between articles and use these to recommend articles that are not currently in a user's Mendeley account? Mendeley has over 2m accounts from which associations between articles could be derived. Citation relationships, which could perhaps be pulled from databases such as PubMed, could also be used. The matching algorithm could also be restricted to consider only one article or group of articles if the user wants to find something on a specific topic.
As a researcher, I sometimes stumble across articles that are highly relevant to me and that I wish I had found earlier, but didn't because I was using the wrong search terms or looking in the wrong databases / journals. Such a tool would increase researchers' exposure to the latest trends in their research field.
- Mark Drakesmith, a post-doctoral neuroscientist at Cardiff University
- Some programming skills (MATLAB, Python and a bit of C++) but no experience of 'hacking', handling databases, etc. A keenness to do something outside my comfort zone!
- Anyone who is interested! Particularly people with more knowledge or experience of accessing / using this type of data.
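To make the "associations between articles" part of the pitch a bit more concrete, here is a minimal sketch of one simple approach: count how often two articles appear in the same user library, then suggest the articles that most often sit alongside the ones a user already has. This is plain co-occurrence counting, swapped in for illustration; it is not the classifier or SVM mentioned above, and it does not use any real Mendeley API. The library data, user names and function names are all made up.

```python
from collections import Counter, defaultdict
from itertools import combinations

def build_cooccurrence(libraries):
    """Map each article to a Counter of the articles saved alongside it."""
    co = defaultdict(Counter)
    for articles in libraries.values():
        # Each unordered pair in one library counts as one co-occurrence.
        for a, b in combinations(sorted(set(articles)), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co

def recommend_for(user_articles, co, top_n=3):
    """Suggest articles that co-occur most with the user's own, skipping ones already saved."""
    owned = set(user_articles)
    scores = Counter()
    for article in owned:
        for other, n in co.get(article, {}).items():
            if other not in owned:
                scores[other] += n
    return [article for article, _ in scores.most_common(top_n)]

# Hypothetical libraries: user id -> list of article ids.
libraries = {
    "u1": ["A", "B", "C"],
    "u2": ["A", "B"],
    "u3": ["B", "D"],
}
co = build_cooccurrence(libraries)
print(recommend_for(["A"], co))
```

Passing a single article (or a small group) as `user_articles` corresponds to the case where the user only wants suggestions on one specific topic.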
Georg Walther: Love the idea. We could start by fingerprinting the abstract and / or main text (if available) of each article. One way of making a start might be to use a library such as http://nltk.org/ to parse the corresponding bodies of text and count the occurrences of all / some words as a fingerprint. We would probably need to store these fingerprints for later queries. Your idea also seems to be of general interest to Mendeley: http://krisjack.wordpress.com/2013/05/30/workshop-on-academic-industrial-collaborations-for-recommender-systems/
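A rough sketch of the fingerprinting suggestion, under a few assumptions: it uses a plain regular expression and a tiny hand-written stop-word list instead of NLTK (which offers proper tokenizers and fuller stop-word lists), and it compares fingerprints with cosine similarity, which is one common choice rather than something proposed above. The titles, texts and function names are invented for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; NLTK ships a fuller one).
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "for", "is", "on", "with", "we"}

def fingerprint(text):
    """Count occurrences of each non-stop word in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)

def cosine_similarity(fp_a, fp_b):
    """Cosine similarity between two word-count fingerprints (0.0 means no shared words)."""
    dot = sum(count * fp_b[word] for word, count in fp_a.items())
    norm_a = math.sqrt(sum(c * c for c in fp_a.values()))
    norm_b = math.sqrt(sum(c * c for c in fp_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommend(query_text, stored, top_n=3):
    """Rank stored fingerprints by similarity to the query text, dropping non-matches."""
    query_fp = fingerprint(query_text)
    scored = [(cosine_similarity(query_fp, fp), title) for title, fp in stored.items()]
    scored.sort(reverse=True)
    return [title for score, title in scored[:top_n] if score > 0]

# Fingerprints are computed once and kept for later queries (here just in a dict).
stored = {
    "Paper A": fingerprint("Diffusion MRI tractography of white matter in the human brain"),
    "Paper B": fingerprint("Gene expression in yeast under heat stress"),
}
print(recommend("white matter tractography", stored))
```

In practice the stored fingerprints would live in a database rather than a dict, and raw counts would probably want some weighting so that very common words do not dominate.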