GESIS-Methods-Hub/guidelines

Methods Hub's Guidelines

Here you will find the guidelines used by Methods Hub.

The Methods Hub aims to provide high-quality and easy-to-use computational methods and tutorials to social scientists. Offering such resources through the Methods Hub makes them directly available to the target audience. However, the Methods Hub only accepts resources that follow the principles of open science, that are available in a format that is accessible for social scientists, and that are relevant for social science research. A special focus of the Methods Hub is on resources that work on digital behavioral data, but other resources are also welcome.

A method for the Methods Hub is a sequence of instructions that a computer should execute to perform a specific task and that is bundled for reusability, as well as its documentation.

A tutorial is an instructional resource that may be used as a part of a self-guided learning process. Tutorials on the Methods Hub should focus on very concrete tasks and offer code that helps researchers to solve the task. This could be via applications of methods that are featured on the Methods Hub, but could also refer to methods published elsewhere. A tutorial can feature more than one method. Tutorials will be prefaced with the prior knowledge expected from the user, so that users can judge for themselves whether they have the required skills to follow the tutorial.

To be included in the Methods Hub, a resource is checked to see if it fulfills the criteria in the publishing checklist below. If you believe your resource meets these criteria, submit it for review on the Methods Hub Portal. For more details on documentation and code quality, see the Quality criteria section of the guidelines.

Publishing checklist

Each method or tutorial submitted to the Methods Hub is checked for compliance with the following criteria before publication.

Openness criteria

  • The method or tutorial is developed in an open-source programming language (e.g., Python or R).
  • The method or tutorial is publicly accessible in a Git repository.
    • If a method, the Git repository contains one and only one method.
  • The method or tutorial is published under an open license.

Scoping criteria

  • The method or tutorial is relevant for the social sciences.

  • The method or tutorial belongs to a relevant task of the Tasks Taxonomy.

    If none of the current tasks in the Tasks Taxonomy fits a method or tutorial, contact us at methodshub@gesis.org to extend the taxonomy.

Quality criteria

Documentation quality criteria

  • The method or tutorial repository contains the necessary files for setting up a binder environment for Methods Hub.

    • The method or tutorial repository contains the configuration files for installing all requirements (e.g., environment.yml, requirements.txt, install.R).
    • The method or tutorial repository contains the postBuild file that facilitates Quarto installation.
    • The binder environment is set up without errors.
  • The method or tutorial repository contains a LICENSE file (corresponding to an open license) at the root level of the repository.

  • The method or tutorial repository contains a CITATION.cff file at the root level of the repository.

  • The method or tutorial repository contains a file, selected in the submission form, that follows the structure of the templates.

    If a method, this file must be a Methods Hub friendly README (can be README.md or another file).

    If a tutorial, this file must be the tutorial itself in one of the accepted formats.

    | Format | File extension | Template | Notes |
    |---|---|---|---|
    | Quarto | .qmd | tutorial/template.qmd | |
    | Jupyter Notebook Format | .ipynb | tutorial/template.ipynb | Limited to a single programming language. |
    | R Markdown | .rmd | | If possible, should be ported to Quarto. |
    | (Pandoc) Markdown | .md | | |
  • All examples in the method or tutorial repository can be reproduced with reasonable accuracy using only publicly available resources.
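
As a rough illustration of the file-related items in the checklist above, the following minimal sketch checks a list of repository paths for the required root-level files and maps a tutorial file to one of the accepted formats. The function names and the idea of passing in a plain list of paths are assumptions made for this example; this is not the check that the Methods Hub itself runs.

```python
from pathlib import PurePosixPath

# Accepted tutorial formats, taken from the table above (extension -> format).
ACCEPTED_TUTORIAL_FORMATS = {
    ".qmd": "Quarto",
    ".ipynb": "Jupyter Notebook Format",
    ".rmd": "R Markdown",
    ".md": "(Pandoc) Markdown",
}

# Files the checklist requires at the root level of the repository.
REQUIRED_ROOT_FILES = ("LICENSE", "CITATION.cff")


def missing_root_files(repo_files):
    """Return the required root-level files that are absent.

    repo_files is an iterable of repository-relative POSIX paths
    (a hypothetical input format chosen for this sketch).
    """
    present = set(repo_files)
    return [name for name in REQUIRED_ROOT_FILES if name not in present]


def tutorial_format(path):
    """Return the format name for a tutorial file, or None if not accepted.

    The extension is compared case-insensitively, so ".Rmd" is treated
    like ".rmd" (an assumption of this sketch).
    """
    suffix = PurePosixPath(path).suffix.lower()
    return ACCEPTED_TUTORIAL_FORMATS.get(suffix)


if __name__ == "__main__":
    files = ["README.md", "LICENSE", "tutorial.qmd"]
    print(missing_root_files(files))
    print(tutorial_format("tutorial.qmd"))
```

The sketch only looks at file names; it does not verify that the LICENSE is an open license or that the file follows the template structure.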

Code quality criteria

The code quality criteria can be skipped for methods for which a paper is published by the following trusted third-party review venues.

You can suggest further venues by mail to the Methods Hub team.

  • The method code contains documentation (comments) for parameters and decisions that allow one to adjust the method.
  • The method code is structured into modules (if need be).

Binder environment

These binder configuration files can be located at the root level or in a directory named .binder or binder. In the following sections, we will assume these files to be located in binder.

Specifically for the Methods Hub, the following files must be available among the binder configuration files:

  1. A binder/postBuild file that facilitates Quarto installation. The postBuild can be downloaded from https://methodshub.gesis.org/snippet/postBuild/.
  2. Configuration files that record the computational environment (e.g., dependencies). See the following sections on how to create these files for different programming languages.
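
The two requirements above can be sketched as a small lookup over the three allowed locations. This is a minimal sketch under stated assumptions: the function name, the plain list of paths as input, and the order in which locations are searched are all choices made for this example, and the dependency file names are only the examples mentioned in the checklist.

```python
# Allowed locations for binder configuration files; "" is the repository root.
# The search order is an assumption of this sketch.
BINDER_DIRS = ("binder", ".binder", "")

# Example dependency files named in the checklist; other files may be valid too.
DEPENDENCY_FILES = ("environment.yml", "requirements.txt", "install.R")


def check_binder_files(repo_files):
    """Locate postBuild and dependency files among repository-relative paths."""
    files = set(repo_files)

    def find(name):
        # Return the first allowed location that contains the file, if any.
        for directory in BINDER_DIRS:
            candidate = f"{directory}/{name}" if directory else name
            if candidate in files:
                return candidate
        return None

    found = (find(name) for name in DEPENDENCY_FILES)
    return {
        "postBuild": find("postBuild"),
        "dependencies": [path for path in found if path is not None],
    }


if __name__ == "__main__":
    print(check_binder_files(["binder/postBuild", "binder/requirements.txt"]))
```

A result with `"postBuild": None` or an empty `"dependencies"` list would indicate that one of the two requirements is not met.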

Python

Create binder/requirements.txt using pip.

python3 -m pip freeze > binder/requirements.txt

The binder/requirements.txt should look like binder-examples/python/requirements.txt.

It is strongly recommended to pin the version of the dependencies.
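
To check the pinning recommendation, a minimal sketch like the following lists the entries of a requirements file that lack an exact `==` version. The function name is hypothetical, and treating only `==` as pinned is an assumption of this sketch; entries such as `package @ file:///...` are reported as unpinned.

```python
import re

# A line counts as pinned here only if it uses an exact "==" specifier.
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]*\])?\s*==\s*[^\s;]+")


def unpinned_requirements(text):
    """Return the requirement lines in text that are not pinned with "=="."""
    unpinned = []
    for raw in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue  # skip blanks, comments, and pip options such as -r or -e
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


if __name__ == "__main__":
    example = "numpy==1.26.4\npandas>=2.0\nrequests\n"
    print(unpinned_requirements(example))
```

An empty result means every entry in the file carries an exact version.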

R

install.packages() or similar commands for installing R packages (e.g. pak::pkg_install(), devtools::install_github()) should not be called from the tutorial source file (e.g. .qmd, .rmd, or .ipynb).

Instead, create binder/runtime.txt (which contains the current R version and a snapshot date) and binder/install.R.

## Record the current R version and use the current date as the snapshot date
Rscript -e "writeLines(paste0('r-', getRversion(), '-', format(Sys.time(), '%Y-%m-%d')), 'binder/runtime.txt')" 

Then add install.packages() calls to binder/install.R. The binder/install.R should look like binder-examples/r/install.R.

Although allowed, there is no need to pin the version with tools such as renv because P3M is used when creating a binder environment. It will install the latest version of R packages according to the snapshot date recorded in runtime.txt.
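
Because the snapshot date in runtime.txt determines which package versions are installed, it can help to check that the file has the expected shape. The following is a minimal sketch that parses the format written by the Rscript command above (`r-`, the R version, and a date). The function name is hypothetical, and the three-part version pattern is an assumption based on what getRversion() typically returns.

```python
import re
from datetime import date

# Expected shape: r-<major>.<minor>.<patch>-<YYYY>-<MM>-<DD>
RUNTIME_PATTERN = re.compile(r"^r-(\d+\.\d+\.\d+)-(\d{4})-(\d{2})-(\d{2})$")


def parse_runtime(text):
    """Return (r_version, snapshot_date) parsed from the runtime.txt content.

    Raises ValueError if the text does not match the expected shape or if
    the date is not a valid calendar date.
    """
    match = RUNTIME_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unexpected runtime.txt content: {text!r}")
    version, year, month, day = match.groups()
    return version, date(int(year), int(month), int(day))


if __name__ == "__main__":
    version, snapshot = parse_runtime("r-4.3.2-2024-01-15\n")
    print(version, snapshot.isoformat())
```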

If there is a need to illustrate the installation process using install.packages() or similar commands for installing R packages, set the code block to eval: false as illustrated in tutorial/template.qmd.
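
One way to catch installation calls that would still be executed is to scan the tutorial source for them. The following minimal sketch reports install calls inside R code blocks that are not marked with `#| eval: false`. The function name is hypothetical, and the sketch makes simplifying assumptions: it only recognizes the `#| eval: false` comment form, it only looks at blocks opened with `{r}`-style headers, and it also flags calls that appear in R comments.

```python
import re

# Installation commands named in the guideline above.
INSTALL_CALL = re.compile(
    r"install\.packages\(|pak::pkg_install\(|devtools::install_github\("
)
CHUNK_OPEN = re.compile(r"^\s*`{3}\{r[^}]*\}\s*$")
CHUNK_CLOSE = re.compile(r"^\s*`{3}\s*$")
EVAL_FALSE = re.compile(r"^\s*#\|\s*eval:\s*false\s*$")


def evaluated_install_calls(qmd_text):
    """Return (line_number, line) pairs for install calls in evaluated R blocks."""
    findings = []
    in_chunk = False
    eval_disabled = False
    pending = []
    for number, line in enumerate(qmd_text.splitlines(), start=1):
        if not in_chunk:
            if CHUNK_OPEN.match(line):
                in_chunk, eval_disabled, pending = True, False, []
            continue
        if CHUNK_CLOSE.match(line):
            # Report the block's install calls only if it would be evaluated.
            if not eval_disabled:
                findings.extend(pending)
            in_chunk = False
            continue
        if EVAL_FALSE.match(line):
            eval_disabled = True
        elif INSTALL_CALL.search(line):
            pending.append((number, line.strip()))
    return findings
```

An empty result suggests that every installation call in the file is either absent or set to eval: false.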

Many languages (conda)

If you use conda to configure your computational environment, create binder/environment.yml with

## Export the current active environment
conda env export > binder/environment.yml

or

## Export a specific environment, e.g. environment-name
conda env export -n environment-name > binder/environment.yml

The binder/environment.yml should look like binder-examples/conda/environment.yml.

It is strongly recommended to pin the version of the dependencies.

Frequently asked questions

  1. What is the Methods Hub?

    The Methods Hub is an infrastructure platform that provides openly accessible, reusable computational methods for working with digital behavioral data in social science research.

  2. Who can submit a method or tutorial to the Methods Hub?

    Researchers, practitioners, and developers in computational social science, computer science, natural language processing, and related fields can submit methods or tutorials.

  3. Can I publish my computational method on Methods Hub?

    Yes, provided it is openly accessible, published under an open license, and belongs to a relevant task of the Tasks Taxonomy.

  4. How can I increase my chances of getting my method published?

    By providing well-written documentation that follows the README template, providing all necessary files, and making the code reusable with minimal or no user involvement.

  5. Which programming languages are supported?

    The platform supports only open source programming languages such as Python and R.

  6. Can I publish my method or tutorial if it uses a paid API or tool?

    No, a method or tutorial on the Methods Hub must be fully reusable: all resources it uses, including APIs and packages, must be openly accessible to all.

  7. What if my method is already published in a peer-reviewed journal?

    Methods published in trusted third-party venues with proper documentation and code quality can be submitted directly.

  8. Should I write a tutorial about my method?

    Yes, tutorials increase the reach of a method to researchers and practitioners with limited practical experience of artificial intelligence methods. It is therefore highly advised to write a tutorial that demonstrates, as a step-by-step guide, how to apply your method to a research question.

  9. Can I write a tutorial about someone else's method?

    Yes, you can write a tutorial about other developers' methods as your contribution. You can also write tutorials about methods that are not published on the Methods Hub but are of interest to the Methods Hub audience.

  10. Where should the method or tutorial code be hosted?

    The method or tutorial must be publicly accessible from a Git repository. This includes GitHub, GitLab, and others.

  11. What happens when I submit my method?

    When the method is submitted, it is held for review. During this period the reviewer(s) can add issues to the Git Repository if modifications are needed. Once there is no issue to resolve, the method is published on the portal.

  12. What does it mean that a method is published?

    When a method is published, it appears in the Methods Hub gallery and (from the next day) is searchable through GESIS Search.

  13. What are the differences between code processed by knitr and jupyter?

    There is one subtle but important difference in code execution between knitr (the default renderer for R code in Quarto) and jupyter. For example, consider this R code block (see the provided file code_exec.qmd):

    ```{r}
    mean(mtcars$mpg)
    plot(mtcars$mpg, mtcars$wt)
    ```
    

    Suppose this file is rendered into a notebook by Quarto using

    quarto render code_exec.qmd --to=ipynb

    With this command, all code blocks are rendered, but they are also modified: the plotting line is removed. This is not ideal. There are two ways to fix it.

    The first is to use quarto convert instead, which generates an ipynb whose code cells have not been executed (an "empty" ipynb).

    quarto convert code_exec.qmd -o code_exec.ipynb

    The second is to split the code block so that each block contains a single line. For the plotting block, you must also add an execution option (echo: true in the example here). Only in this case does quarto render keep the visualization code.

    ```{r}
    mean(mtcars$mpg)
    ```
    
    ```{r}
    #| echo: true
    plot(mtcars$mpg, mtcars$wt)
    ```
    

Contact

Methods Hub Team <methodshub@gesis.org>
