-
This project includes several examples of how to use the scraper to get customized data, but it is primarily intended as a framework that lets programmers create their own objects and methods to scrape the data they want.
-
The project was created for programmers as a replacement for the no-longer-maintained OsrsBox project.
-
To efficiently create your own scraping methods, you will need a basic understanding of Lua, Kotlin, and MediaWiki.
OsrsWiki (OsrsWiki.kt)
The OsrsWiki class is the main class of the project. It provides methods to scrape data from the OSRS Wiki.
val wiki = OsrsWiki.builder()
    .withCookieManager(CookieManager())
    .withProxy(Proxy())
    .withUserAgent("Custom User Agent")
    .withScribuntoSessionCount(10)
    .build()
-
Optionally set a custom cookie manager.
-
.withCookieManager( CookieManager() )
-
-
Optionally set a custom proxy.
-
.withProxy( Proxy() )
-
-
Optionally set a custom user agent.
-
.withUserAgent( "Custom User Agent" )
-
-
Optionally set the default number of Scribunto sessions used for bulk Scribunto requests.
-
.withScribuntoSessionCount( 10 )
-
-
Get page titles from Item IDs:
-
wiki.getItemPageTitlesFromIds(11832, 11834, 11836) // ["Bandos chestplate", "Bandos tassets", "Bandos boots"]
-
-
Get page titles from NPC IDs:
-
wiki.getNpcPageTitlesFromIds(1399, 2639) // ["King Roald", "Robert The Strong"]
-
-
Get all Item titles:
-
wiki.getAllItemTitles() // ["Abyssal whip", "Abyssal bludgeon", "Abyssal dagger", ...]
-
-
Get all NPC titles:
-
wiki.getAllNpcTitles() // ["Abyssal demon", "Abyssal leech", "Abyssal lurker", ...]
-
-
Get ItemDetails by name(s) or all:
-
wiki.getItemDetails("Bandos chestplate", "Bandos tassets", "Bandos boots") // Map<String, List<ItemDetails>>
wiki.getAllItemDetails() // Map<String, List<ItemDetails>>
-
-
Get NpcDetails by name(s) or all:
-
wiki.getNpcDetails("King Roald", "Robert The Strong") // Map<String, List<NpcDetails>>
wiki.getAllNpcDetails() // Map<String, List<NpcDetails>>
-
-
Get MonsterDetails by name(s) or all:
-
wiki.getMonsterDetails("Abyssal demon", "Abyssal leech", "Abyssal lurker") // Map<String, List<MonsterDetails>>
wiki.getAllMonsterDetails() // Map<String, List<MonsterDetails>>
-
-
Get QuestRequirements for all quests:
-
wiki.getQuestRequirements() // Map<String, List<QuestRequirement>>
-
-
Get VarbitDetails for all varbits on the Wiki:
-
wiki.getVarbitDetails() // Map<Int, VarbitDetails>
-
-
Get ProductionDetails for all items with production data:
-
wiki.getProductionDetails() // Map<String, ProductionDetails>
-
-
Get ItemPrice for Item ID:
-
wiki.getItemPrice(11832) // WikiItemPrice?
-
-
Get all LocLineDetails:
-
wiki.getAllLocLineDetails() // Map<String, List<LocLineDetails>>
-
-
Get Slayer Monsters and their Task IDs:
-
wiki.getSlayerMonstersAndTaskIds() // Map<String, Int>
-
-
Get the Slayer Masters that assign a task:
-
wiki.getSlayerMastersThatAssign("Ghouls") // ["Mazchna", "Vannaka"]
-
-
Get all titles in a category:
-
wiki.getTitlesInCategory("Items", "Monsters") // List<String>
-
-
Get all titles using any (one or more) of the specified template(s):
-
wiki.getAllTitlesUsingTemplate("Infobox Item", "Infobox Bonuses") // List<String>
-
-
Get all titles using all of the specified template(s):
-
wiki.getAllTitlesUsingTheseTemplates("Infobox Item", "Infobox Bonuses") // List<String>
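The difference between the two template queries above is the difference between a union and an intersection. The following is a minimal sketch of that distinction only, assuming a hypothetical in-memory map from page title to the set of template names on that page. The real methods query the Wiki, and titlesUsingAny and titlesUsingAll are illustrative names that are not part of OsrsWiki.

```kotlin
// Hypothetical helpers for illustration; not part of OsrsWiki.
// `pages` maps a page title to the names of the templates used on that page.

// Keeps titles that use at least one of the given templates (union).
fun titlesUsingAny(pages: Map<String, Set<String>>, vararg templates: String): List<String> =
    pages.filterValues { used -> templates.any { it in used } }.keys.toList()

// Keeps titles that use every one of the given templates (intersection).
fun titlesUsingAll(pages: Map<String, Set<String>>, vararg templates: String): List<String> =
    pages.filterValues { used -> templates.all { it in used } }.keys.toList()
```

Given made-up data where one page uses both Infobox Item and Infobox Bonuses and another uses only Infobox Item, titlesUsingAny would return both titles, while titlesUsingAll would return only the first.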
-
-
Get all template names present on a page:
-
wiki.getNamesOfTemplatesOnPage("Baby chinchompa") // List<String>
-
-
Get all uses of a template across the entire Wiki:
-
wiki.getAllTemplateUses("Infobox Item") // Map<String, List<JsonObject>>
-
-
Get all data for specified template(s) on a page:
-
wiki.getTemplateDataOnPage("Baby chinchompa", "Infobox Item", "Infobox Bonuses") // Map<String, List<JsonObject>>
-
-
Get all data for all templates on a page:
-
wiki.getAllTemplateDataOnPage("Baby chinchompa") // Map<String, List<JsonObject>>
-
-
Get all titles in categories with revisions since a specified date:
-
val threeDaysAgo = Date.from(Instant.now().minus(3, ChronoUnit.DAYS))
wiki.getAllTitlesWithRevisionsSince(threeDaysAgo, "Items") // List<String>
-
-
Get last revision timestamp for title(s):
-
wiki.getLastRevisionTimestamp("Baby chinchompa", "Black chinchompa") // Map<String, String>
wiki.getLastRevisionTimestamp(listOf("Baby chinchompa", "Black chinchompa")) // Map<String, String>
-
-
Dynamic Page List (DPL3) query:
-
val query = mapOf(
    "category" to "Items",
    "count" to 10,
    "include" to "{Infobox Item}",
)
val response = wiki.dplAsk(query) // JsonElement
- Further explanation of DPL3 queries can be found below in the Scribunto Session section and in the DPL3 documentation.
-
-
MediaWiki Semantic Search:
-
val query = listOf(
    "[[Location JSON::+]]",
    "?#-=title",
    "?Production JSON",
)
val response = wiki.smwAsk(query) // JsonElement
-
Scribunto Session (ScribuntoSession.kt)
The Scribunto Session connects to the MediaWiki API and allows Lua scripts to be executed on the Wiki. This makes it useful for:
- Executing custom Lua scripts on the Wiki.
- Loading data from the Wiki Lua modules.
- Using the DPL3 query language to query the Wiki.
- Controlling the format and the volume of the data returned by the Wiki.
val session = wiki.createScribuntoSession {
    withoutDefaultCode()
    withWikiModule("ModuleName")
    withCode("print('Hello World')")
    withCode {
        /* Use the Lua Builder */
    }
}
- Optionally disable the default code included in the session. You can add your own code with the withCode function.
-
.withoutDefaultCode()
-
- Optionally set the module the session will use. By default this is "Var", for no particular reason other than being a small module.
-
.withWikiModule("ModuleName")
-
- Optionally add code to persist in the session.
-
.withCode("print('Hello World')")
.withCode { /* Use the Lua Builder */ }
- See LuaBuilder.kt for more information on the Lua Builder.
-
-
Send a request with a string of Lua code:
-
session.sendRequest("print(\"Hello World\")") // Pair<Boolean, JsonElement>
-
-
Send a request with a LuaBuilder instance:
-
session.sendRequest { /* Use the Lua Builder */ } // Pair<Boolean, JsonElement>
-
-
Send a request with true as the first parameter and it will automatically refresh the Scribunto Session:
-
session.sendRequest(true, "print(\"Hello World\")") // Pair<Boolean, JsonElement>
session.sendRequest(true) { /* Use the Lua Builder */ } // Pair<Boolean, JsonElement>
-
-
The return value from the sendRequest function is a Pair<Boolean, JsonElement>, where the first value is whether or not the request was successful and the second value is the response from the Wiki's print return field.
-
To get a value back from the wiki, use the Lua print function.
-
The default Lua code provided includes a method to return values called printReturn, which returns the input value as a JSON string.
-
{
    "success": true,
    "message": "Only present if success is false",
    "printReturn": "{\"json\": \"value\"}"
}
-
-
The default code sent to the Wiki can be found here: Scribunto.lua
-
The session uses the same Session ID for each request. The wiki will continue to add the code in the requests to the session until the session is refreshed or expires.
-
The session will automatically refresh when it expires.
-
If the session has failed too many requests since the last refresh, it will automatically refresh.
-
The session can be refreshed manually:
-
session.refresh()
-
Lua Builder (LuaBuilder.kt)
This is not intended to be a full Lua interpreter or converter, but rather a tool to make it easier to create Lua code.
-
You can create a LuaScope instance with the lua function:
-
lua { /* Use the Lua Builder */ }
-
-
The supported value types are:
- String
- Number
- Date
- Boolean
- Map<*, *> (* values may be any of the above types)
- Iterable<*> (* values may be any of the above types)
-
To set a key's value, use `=`, like "key" `=` "value".
There are two types of LuaScope with slight differences.
-
The LuaGlobalScope
- This is the default scope and only allows String keys.
- These keys can use .local() to prepend the key with local, making it a local variable. "myValue".local() will output local myValue.
-
The LuaTableScope
- This scope allows String, Number, Boolean, and Date keys.
- These keys cannot use .local() because they are entries in a table.
-
Kotlin | Lua Output |
---|---|
"myValue" `=` "value" | myValue = "value" |
"myValue".local() `=` "value" | local myValue = "value" |
"myModule" `=` require("ModuleName") | myModule = require("ModuleName") |
+"print('This code is just added as is to the Lua script')" | print('This code is just added as is to the Lua script') |

Inside the braces is a LuaTableScope, which allows keys other than String.

Kotlin:
"myTable" `=` {
    "myKey" `=` "myValue"
    48 `=` Date()
    Date() `=` "myValue"
    1.0 `=` 1
    true `=` "myTrueValue"
    "something" `=` true
    "myListInLua" `=` listOf("a", "b", "c")
    "myMapInLua" `=` mapOf("a" to "b", "c" to "d")
}

Lua Output:
myTable = {
    ["myKey"] = "myValue",
    [48] = "2022-12-21 17:33:09",
    ["2022-12-21 17:33:09"] = "myValue",
    [1.0] = 1,
    [true] = "myTrueValue",
    ["something"] = true,
    ["myListInLua"] = {"a", "b", "c"},
    ["myMapInLua"] = {
        ["a"] = "b",
        ["c"] = "d"
    }
}
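The value mappings shown in the table can be pictured as a recursive conversion from Kotlin values to Lua literals. The sketch below is only an illustration of that idea and is not the code in LuaBuilder.kt: toLuaLiteral is a hypothetical helper, the date pattern is an assumption inferred from the sample output above, and the string escaping is deliberately minimal.

```kotlin
import java.text.SimpleDateFormat
import java.util.Date

// Hypothetical helper, not part of LuaBuilder.kt: renders a supported Kotlin value as a Lua literal.
fun toLuaLiteral(value: Any?): String = when (value) {
    null -> "nil"
    // Strings are quoted, escaping backslashes and double quotes only.
    is String -> "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\""
    is Number, is Boolean -> value.toString()
    // Assumption: dates become quoted strings shaped like "2022-12-21 17:33:09".
    is Date -> toLuaLiteral(SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(value))
    // Maps become Lua tables with bracketed keys.
    is Map<*, *> -> value.entries.joinToString(separator = ", ", prefix = "{", postfix = "}") { e ->
        "[${toLuaLiteral(e.key)}] = ${toLuaLiteral(e.value)}"
    }
    // Iterables become array-style Lua tables.
    is Iterable<*> -> value.joinToString(separator = ", ", prefix = "{", postfix = "}") { toLuaLiteral(it) }
    else -> throw IllegalArgumentException("Unsupported value: $value")
}
```

Under these assumptions, toLuaLiteral(listOf("a", "b", "c")) produces {"a", "b", "c"}, and a map produces a table with one bracketed key per entry, in the same shape as the myListInLua and myMapInLua entries above.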
-
Versioned Map (VersionedMap.kt)
- The best way to obtain this is by calling .toVersionedMap() on a JsonObject received from the wiki.
val versionedMap = jsonObject.toVersionedMap()
- The VersionedMap will create a TemplatePropertyData for each key:
data class TemplatePropertyData(
    val name: String,
    val key: String,
    val isWikiKey: Boolean,
    val version: Int,
    val value: String
)
- Example Template Data:
{
"id1" : 111,
"id2" : 222,
"id3" : 333
}
- Would create these property data classes:
TemplatePropertyData(name="id1", key="id", isWikiKey=true, version=1, value="111")
TemplatePropertyData(name="id2", key="id", isWikiKey=true, version=2, value="222")
TemplatePropertyData(name="id3", key="id", isWikiKey=true, version=3, value="333")
- You can check how many versions a template has with versionedMap.versions.
- By default, getting a property without the version will return Version 0. Version 0 is all values combined or, for a property with only a single version, the value itself.
- You can also use the original key if you know it and are expecting it. id3 will work the same as ["id", 3].
- If a template has multiple versions, some values may be the same across all versions and will not have a versioned key.
- So if a version of a key is requested that does not exist, it will return the first or only value available.
- You can get a full map of a specific version, or a list containing a map for each individual version.
val versionCount = versionedMap.versions // 3
val id = versionedMap["id"] // "111, 222, 333"
val id1 = versionedMap["id", 1] // "111"
val id2 = versionedMap["id", 2] // "222"
val id5 = versionedMap["id", 5] // "111"
val version2 = versionedMap.getVersion(2) // Map<String, String>
val allVersions = versionedMap.getIndividualVersions() // List<Map<String, String>>
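To make the lookup rules above concrete, here is a small self-contained sketch of how a numeric key suffix could be split into a base key and a version, and how the Version 0 and fallback behavior could work. It is an illustration under stated assumptions, not the code in VersionedMap.kt: VersionedKey, parseVersionedKey, and lookup are hypothetical names, and treating every trailing number as a version is a simplification.

```kotlin
// Hypothetical stand-in for the key/version part of TemplatePropertyData.
data class VersionedKey(val key: String, val version: Int)

// Splits "id3" into ("id", 3). Names without a numeric suffix get version 0.
fun parseVersionedKey(name: String): VersionedKey {
    val match = Regex("^(.+?)(\\d+)$").matchEntire(name)
        ?: return VersionedKey(name, 0)
    val (base, digits) = match.destructured
    return VersionedKey(base, digits.toInt())
}

// Looks up a base key in raw template data, following the rules described above.
fun lookup(data: Map<String, String>, key: String, version: Int = 0): String? {
    val versions = data.entries
        .map { parseVersionedKey(it.key) to it.value }
        .filter { it.first.key == key }
        .sortedBy { it.first.version }
    if (versions.isEmpty()) return null
    // Version 0 combines all values (or is the value itself when there is only one).
    if (version == 0) return versions.joinToString(", ") { it.second }
    // A missing version falls back to the first or only value available.
    return versions.firstOrNull { it.first.version == version }?.second
        ?: versions.first().second
}
```

With the example template data above, this sketch gives the same results as the calls shown: version 0 yields the combined string, version 2 yields "222", and the nonexistent version 5 falls back to "111".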
-
TitleQueue (TitleQueue.kt)
-
If the response is too long, the Wiki will return an error. If this happens, you may need to lower the chunk size.
-
Create a new queue with the list of titles and the chunk size. (The default size is 100)
-
val titles = wiki.getAll
val queue = TitleQueue(titles, 50)
-
-
Then call queue.execute { /* Your code here */ } to execute the queue.
-
The block inside the execute function is suspending.
-
The parameter passed to the block is a list of titles to be processed.
-
The block should only return titles that failed to be processed and will be re-added to the queue.
-
val processedResults = mutableMapOf<String, String>()
queue.execute { titlesChunk ->
    // Process the titles here, adding any data to your results and returning any failed titles.
    // No data is returned from execute.
}
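The chunk-and-retry contract described above can be sketched in a few lines. This is a simplified, hypothetical stand-in and not the code in TitleQueue.kt: ChunkedRetryQueue is an illustrative name, the block here is a plain (non-suspending) lambda, and the maxRounds cap is an assumption added so that titles which always fail cannot loop forever.

```kotlin
// Hypothetical, simplified stand-in for TitleQueue, for illustration only.
class ChunkedRetryQueue(titles: List<String>, private val chunkSize: Int = 100) {
    private val pending = ArrayDeque(titles)

    init {
        require(chunkSize > 0) { "chunkSize must be positive" }
    }

    // The block receives one chunk of titles and returns only the titles that failed.
    fun execute(maxRounds: Int = 5, block: (List<String>) -> List<String>) {
        var rounds = 0
        while (pending.isNotEmpty() && rounds < maxRounds) {
            val failed = mutableListOf<String>()
            while (pending.isNotEmpty()) {
                // Take up to chunkSize titles from the front of the queue.
                val chunk = List(minOf(chunkSize, pending.size)) { pending.removeFirst() }
                failed += block(chunk)
            }
            // Failed titles are re-added to the queue and retried in the next round.
            pending.addAll(failed)
            rounds++
        }
    }
}
```

As with the real queue, results are collected by the caller inside the block (for example into a mutable map), since execute itself returns nothing.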
-
-
Scribunto
-
Dynamic Page List (DPL)
-
Semantic Scribunto