|
| 1 | +Recently I found myself deep inside Apple's MusicKitJS production code, trying to isolate the user authentication flow for Apple Music. |
| 2 | + |
| 3 | +## Background |
| 4 | + |
| 5 | +Over the past few months, I've made [MoovinGroovin](https://www.moovingroovin.com/), a web service that creates playlists from the songs you listened to while working out with Strava turned on. |
| 6 | + |
| 7 | +MoovinGroovin is integrated with Spotify, and I got a request from a user to add support for Apple Music. |
| 8 | + |
| 9 | +As I looked into the integration with Apple Music, I found that to access a user's listening history, I needed a "Music User Token". This is an authentication token generated from an OAuth flow. Unfortunately, the only public way to generate these is through the `authenticate()` method of Apple's MusicKitJS SDK. |
| 10 | + |
| 11 | +This meant I would have to handle authentication with Apple Music on the frontend, while all other integrations were handled by the backend using passportJS. |
| 12 | + |
| 13 | +And so, I decided to extract the auth flow out of MusicKitJS, and wrap it into a separate passportJS strategy (apple-music-passport). |
| 14 | + |
| 15 | +This is where the journey begins... |
| 16 | + |
| 17 | +## TL;DR: |
| 18 | + |
| 19 | +1. Use beautifiers to clean up minified code. |
| 20 | +2. Understand how minifiers compress the execution (control) flow into `&&`, `||`, `,`, `;`, and `(x = y)` |
| 21 | +3. Recognize async constructs |
| 22 | +4. Recognize class constructs |
| 23 | +5. Use VSCode's `rename symbol` to rename variables without affecting other variables with the same name. |
| 24 | +6. Use property names or class methods to understand the context. |
| 25 | +7. Use VSCode's type inference to understand the context. |
| 26 | + |
| 27 | +## 1. Use beautifiers to clean up minified code. |
| 28 | + |
| 29 | +There are plenty of these tools; just search for a beautifier / prettifier / deminifier / unminifier and you will find them. The [Beautify](https://marketplace.visualstudio.com/items?itemName=HookyQR.beautify) and [Prettier](https://marketplace.visualstudio.com/items?itemName=esbenp.prettier-vscode) VSCode extensions work just as well. |
| 30 | + |
| 31 | +Most of these are not very powerful. They will add whitespace, but that's it. You will still need to deal with statements chained with `,`, control flow compressed into `&&` or `||`, ugly classes and asyncs, and cryptic variable names. But you will quickly learn that - unless you're dealing with event-driven flow - you can just stick with where the debugger takes you and ignore most of the cryptic code. |
| 32 | + |
| 33 | +There was one tool (can't find it) which attempted to assign human-readable names to the minified variables. At first this _seemed_ cool, but in truth it can easily mislead you if the generated names happen to make some sense. Instead, rolling with the minified variable names and renaming what _YOU_ understand is the way to go. |
| 34 | + |
| 35 | +## 2. Understand how minifiers compress the execution (control) flow into `&&` , `||` , `,` , `;` , and `(x = y)` |
| 36 | + |
| 37 | +As said above, you will still need to deal with cryptic statements like this:<br> |
| 38 | + |
| 39 | +``` |
| 40 | +void 0 === r && (r = ""), void 0 === i && (i = 14), void 0 === n && (n = window); |
| 41 | +``` |
| 42 | + |
| 43 | +Let's break it down: |
| 44 | + |
| 45 | +### `void 0` as `undefined` |
| 46 | + |
| 47 | +`void 0` is `undefined`. So this checks if `undefined === r`. Simple as that. |
| 48 | + |
| 49 | +### Inlined assignment `(x = y)` |
| 50 | + |
| 51 | +This assigns the value (`""`) to the variable (`r`) and **returns the assigned value**. Be conscious of this especially when you find it inside a boolean evaluation (`&&` or `||`). |
| 52 | + |
| 53 | +Consider the example below; only the second line will be printed:<br> |
| 54 | + |
| 55 | +``` |
| 56 | +(r = "") && console.log('will not print'); |
| 57 | +(r = "abc") && console.log('will print'); |
| 58 | +``` |
| 59 | + |
| 60 | +Logically, this will be evaluated as:<br> |
| 61 | + |
| 62 | +``` |
| 63 | +"" && console.log('will not print'); |
| 64 | +"abc" && console.log('will print'); |
| 65 | +``` |
| 66 | + |
| 67 | +Which is:<br> |
| 68 | + |
| 69 | +``` |
| 70 | +false && console.log('will not print'); |
| 71 | +true && console.log('will print'); |
| 72 | +``` |
| 73 | + |
| 74 | +So while the second line will print, the **first one will not**. |
| 75 | + |
| 76 | +### Conditional execution with `&&` and `||` |
| 77 | + |
| 78 | +The code above used `&&` to execute the `console.log`. |
| 79 | + |
| 80 | +Remember that JS supports [short-circuit evaluation](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Logical_AND#short-circuit_evaluation). This means that the right-hand side of<br> |
| 81 | + |
| 82 | +``` |
| 83 | +abc && console.log('will print'); |
| 84 | +``` |
| 85 | + |
| 86 | +will be executed _if and only if_ `abc` is _truthy_. |
| 87 | + |
| 88 | +In other words, if we have<br> |
| 89 | + |
| 90 | +``` |
| 91 | +false && console.log('will not print'); |
| 92 | +true && console.log('will print'); |
| 93 | +``` |
| 94 | + |
| 95 | +Then `console.log('will not print');` will never be reached. |
| 96 | + |
| 97 | +The same, but inverted, applies to `||`:<br> |
| 98 | + |
| 99 | +``` |
| 100 | +false || console.log('will print'); |
| 101 | +true || console.log('will not print'); |
| 102 | +``` |
| 103 | + |
| 104 | +What does this mean for us when reverse-engineering minified JS code? Often, you can substitute<br> |
| 105 | + |
| 106 | +``` |
| 107 | +abc && console.log('hello'); |
| 108 | +``` |
| 109 | + |
| 110 | +with the more readable<br> |
| 111 | + |
| 112 | +``` |
| 113 | +if (abc) { console.log('hello'); |
| 114 | +} |
| 115 | +``` |
| 116 | + |
| 117 | +One more thing here - be aware of the [operator precedence](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Logical_AND#operator_precedence). |
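To make the precedence point concrete, here is a small sketch. The flag names are made up for illustration and do not come from MusicKitJS. `&&` binds tighter than `||`, so a minified chain can group differently than a left-to-right reading suggests:

```javascript
// Hypothetical flags, only for illustration.
const log = [];
const isCached = true;
const isStale = false;

// Parsed as: isCached || (isStale && log.push("refetch"))
// isCached is truthy, so the right-hand side is never evaluated.
isCached || isStale && log.push("refetch");

// Explicit parentheses change the grouping:
// (isCached || isStale) is truthy, so the push runs.
(isCached || isStale) && log.push("render");

console.log(log); // only "render" was pushed
```

When in doubt, add the parentheses yourself while you rewrite the statement into an `if`.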
| 118 | + |
| 119 | +### Comma operator |
| 120 | + |
| 121 | +So far, we understand that `void 0 === r && (r = "")`<br> |
| 122 | + |
| 123 | +really means<br> |
| 124 | + |
| 125 | +``` |
| 126 | +if (undefined === r) { r = ""; |
| 127 | +} |
| 128 | +``` |
| 129 | + |
| 130 | +We see, though, that in the original code, it's actually followed by a _comma_:<br> |
| 131 | + |
| 132 | +``` |
| 133 | +void 0 === r && (r = ""), void 0 === i && (i = 14), void 0 === n && (n = window); |
| 134 | +``` |
| 135 | + |
| 136 | +This is the [comma operator](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Comma_Operator). |
| 137 | + |
| 138 | +For our reverse-engineering purposes, it just means that **each expression** (separated by a comma) **will be evaluated** and **the value of the last expression will be returned**. |
| 139 | + |
| 140 | +In other words, think of a chain of comma-separated expressions as a mini-function. And so, we can think of the code above as:<br> |
| 141 | + |
| 142 | +``` |
| 143 | +(function() { void 0 === r && (r = ""); void 0 === i && (i = 14); return void 0 === n && (n = window); |
| 144 | +})(); |
| 145 | +``` |
| 146 | + |
| 147 | +Overall, we can now read<br> |
| 148 | + |
| 149 | +``` |
| 150 | +void 0 === r && (r = ""), void 0 === i && (i = 14), void 0 === n && (n = window); |
| 151 | +``` |
| 152 | + |
| 153 | +as<br> |
| 154 | + |
| 155 | +``` |
| 156 | +(function() { if (r === undefined) { r = ""; } if (i === undefined) { i = 14; } if (n === undefined) { n = window; return n; } else { return false; } |
| 157 | +})(); |
| 158 | +``` |
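A `void 0 === x && (x = ...)` chain at the top of a function is typically what default parameter values look like after they have been compiled down to older JavaScript and minified. A minimal sketch, assuming a hypothetical `pad` function (not taken from MusicKitJS), showing that both forms behave the same:

```javascript
// Minified style: defaults applied with `void 0 === x && (x = ...)`,
// chained with the comma operator; the last expression is returned.
function padMinified(s, r, i) {
  return void 0 === r && (r = "-"), void 0 === i && (i = 3), r.repeat(i) + s;
}

// Readable equivalent using default parameters.
function padReadable(s, fill = "-", count = 3) {
  return fill.repeat(count) + s;
}

console.log(padMinified("a"), padReadable("a"));                 // defaults used
console.log(padMinified("a", "*", 1), padReadable("a", "*", 1)); // defaults overridden
```

So when you see such a chain right at the start of a function body, it is usually safe to read it as "these arguments are optional, and here are their defaults".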
| 159 | + |
| 160 | +## 3. Recognize async constructs |
| 161 | + |
| 162 | +Depending on the kind of code that you reverse-engineer, you may come into contact with an async-heavy codebase. MusicKitJS was an example of this, as it handled requests to the Apple Music API, so all methods that made requests were `async`. |
| 163 | + |
| 164 | +You may find the async functions transpiled into `__awaiter` and `__generator` functions. Example:<br> |
| 165 | + |
| 166 | +``` |
| 167 | +API.prototype.recommendations = function (e, t) { return __awaiter(this, void 0, void 0, function () { var r; return __generator(this, function (i) { switch (i.label) { case 0: return [4, this.collection(et.Personalized, "recommendations", e, t)]; case 1: r = i.sent(), this._reindexRelationships(r, "recommendations"); try { return [2, this._store.parse(r)] } catch (e) { return [2, Promise.reject(MKError.parseError(e))] } } }) }) |
| 168 | +} |
| 169 | +``` |
| 170 | + |
| 171 | +Sometimes the `__awaiter` and `__generator` names might not be there, and you will just see this pattern:<br> |
| 172 | + |
| 173 | +``` |
| 174 | +return a(this, void 0, void 0, function () { return __generator(this, function (i) { switch (i.label) { case 0: return ... case 1: return ... ... } }) |
| 175 | +}) |
| 176 | +``` |
| 177 | + |
| 178 | +Either way, these are `async/await` constructs from TypeScript. You can read more about them in [this helpful post by Josh Goldberg](https://medium.com/@joshuakgoldberg/hacking-typescripts-async-await-awaiter-for-jquery-2-s-promises-60612e293c4b). |
| 179 | + |
| 180 | +The important part here is that if we have something like this:<br> |
| 181 | + |
| 182 | +``` |
| 183 | +return a(this, void 0, void 0, function () { return __generator(this, function (i) { switch (i.label) { case 0: /* ABC */ return [4, /* DEF */] case 1: /* GHI */ return [4, /* JKL */] ... } }) |
| 184 | +}) |
| 185 | +``` |
| 186 | + |
| 187 | +We can read most of the body inside each `case N` as regular code. The first element of each returned array is an instruction code: `4` marks an awaited value (e.g. `/* DEF */`), whose result is read back with `i.sent()` in the next case, while `2` marks a plain `return` of the second element (as in the `recommendations` example above). |
| 188 | + |
| 189 | +In other words, the above would translate to<br> |
| 190 | + |
| 191 | +``` |
| 192 | +(async function(){ /* ABC */; await /* DEF */; /* GHI */; await /* JKL */; |
| 193 | +})() |
| 194 | +``` |
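To get a feel for how such a `switch (i.label)` state machine runs, here is a minimal sketch. `runSteps` is a hypothetical, heavily simplified stand-in for the `__awaiter`/`__generator` pair, not TypeScript's actual helper: it only understands instruction code `4` (await) and `2` (return). `fetchNumber` is also made up for the example.

```javascript
// Simplified, hypothetical runner. Real helpers also handle try/catch,
// jumps and yield*, and throw rejections back into the body.
function runSteps(body) {
  const state = { label: 0, value: undefined, sent() { return this.value; } };
  return new Promise((resolve, reject) => {
    function step() {
      let code, payload;
      try {
        [code, payload] = body(state);
      } catch (err) {
        reject(err);
        return;
      }
      if (code === 2) {
        resolve(payload); // [2, x] means `return x`
      } else if (code === 4) {
        state.label += 1; // [4, x] means `await x`, then continue at the next case
        Promise.resolve(payload).then((v) => { state.value = v; step(); }, reject);
      } else {
        reject(new Error("unsupported instruction code " + code));
      }
    }
    step();
  });
}

function fetchNumber() { return Promise.resolve(20); }

// Transpiled-style function.
function doubledTranspiled() {
  return runSteps(function (i) {
    switch (i.label) {
      case 0: return [4, fetchNumber()];
      case 1: return [2, i.sent() * 2 + 2];
    }
  });
}

// Readable equivalent.
async function doubledAsync() {
  const n = await fetchNumber();
  return n * 2 + 2;
}

doubledTranspiled().then((v) => console.log(v)); // logs 42
doubledAsync().then((v) => console.log(v));      // logs 42
```

Each `case` is the code between two `await`s, which is why reading the cases top to bottom usually recovers the original function body.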
| 195 | + |
| 196 | +## 4. Recognize class constructs |
| 197 | + |
| 198 | +Similarly to the previous point, depending on the underlying codebase, you may come across a lot of class definitions. |
| 199 | + |
| 200 | +Consider this example<br> |
| 201 | + |
| 202 | +``` |
| 203 | +API = function (e) { function API(t, r, i, n, o, a) { var s = e.call(this, t, r, n, a) || this; return s.storefrontId = je.ID, s.enablePlayEquivalencies = !!globalConfig.features.equivalencies, s.resourceRelatives = { artists: { albums: { include: "tracks" }, playlists: { include: "tracks" }, songs: null } }, s._store = new LocalDataStore, i && (s.storefrontId = i), n && o && (s.userStorefrontId = o), s.library = new Library(t, r, n), s } return __extends(API, e), Object.defineProperty(API.prototype, "needsEquivalents", { get: function () { return this.userStorefrontId && this.userStorefrontId !== this.storefrontId }, enumerable: !0, configurable: !0 }), API.prototype.activity = function (e, t) { return __awaiter(this, void 0, void 0, function () { return __generator(this, function (r) { return [2, this.resource(et.Catalog, "activities", e, t)] }) }) } |
| 204 | +``` |
| 205 | + |
| 206 | +Quite packed, isn't it? If you're familiar with the older syntax for class definition, it might not be anything new. Either way, let's break it down: |
| 207 | + |
| 208 | +### Constructor as `function(...) {...}` |
| 209 | + |
| 210 | +The constructor is the function that is called to construct the instance object. |
| 211 | + |
| 212 | +You will find these defined as plain functions (but always with the `function` keyword). |
| 213 | + |
| 214 | +In the above, this is the<br> |
| 215 | + |
| 216 | +``` |
| 217 | +function API(t, r, i, n, o, a) { var s = e.call(this, t, r, n, a) || this; return s.storefrontId = je.ID, s.enablePlayEquivalencies = !!globalConfig.features.equivalencies, s.resourceRelatives = { artists: { albums: { include: "tracks" }, playlists: { include: "tracks" }, songs: null } }, s._store = new LocalDataStore, i && (s.storefrontId = i), n && o && (s.userStorefrontId = o), s.library = new Library(t, r, n), s |
| 218 | +} |
| 219 | +``` |
| 220 | + |
| 221 | +which we can read as<br> |
| 222 | + |
| 223 | +``` |
| 224 | +class API { constructor(t, r, i, n, o, a) { ... } |
| 225 | +} |
| 226 | +``` |
| 227 | + |
| 228 | +### Inheritance with `__extends` and `x.call(this, ...) || this;` |
| 229 | + |
| 230 | +Similarly to `__awaiter` and `__generator`, `__extends` is also a [TypeScript helper function](https://stackoverflow.com/questions/45954157/understanding-the-extends-function-generated-by-typescript). And similarly, the variable name `__extends` might not be retained. |
| 231 | + |
| 232 | +However, when you see that: |
| 233 | + |
| 234 | +1) The constructor definition is nested inside another function with some arg<br> |
| 235 | + |
| 236 | +``` |
| 237 | +API = function (e /* This is the parent class */) { function API(t, r, i, n, o, a) { ... } ... |
| 238 | +} |
| 239 | +``` |
| 240 | + |
| 241 | +2) That unknown arg is called inside the constructor<br> |
| 242 | + |
| 243 | +``` |
| 244 | +API = function (e /* This is the parent class */) { function API(t, r, i, n, o, a) { var s = e.call(this, t, r, n, a) || this; /* This is same as `super(t, r, n, a)` */ ... } ... |
| 245 | +} |
| 246 | +``` |
| 247 | + |
| 248 | +3) That same unknown arg is also passed to some function along with our class<br> |
| 249 | + |
| 250 | +``` |
| 251 | +return __extends(API, e) // This passes the prototype of `e` to `API` |
| 252 | +``` |
| 253 | + |
| 254 | +Then you can read that as<br> |
| 255 | + |
| 256 | +``` |
| 257 | +class API extends e { constructor(t, r, i, n, o, a) { super(t, r, n, a); ... } |
| 258 | +} |
| 259 | +``` |
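The three signs above can be tried out on a toy example. This is only a sketch: `extendsHelper` is a hypothetical, simplified stand-in for `__extends`, and `Resource` / `SongResource` are made-up names, not MusicKitJS classes.

```javascript
// Hypothetical, simplified stand-in for TypeScript's __extends helper.
function extendsHelper(child, parent) {
  Object.setPrototypeOf(child, parent); // inherit static members
  child.prototype = Object.create(parent.prototype, {
    constructor: { value: child, writable: true, configurable: true }
  });
}

// Made-up parent class, written the "old" way.
var Resource = function () {
  function Resource(name) { this.name = name; }
  Resource.prototype.describe = function () { return "resource:" + this.name; };
  return Resource;
}();

// Transpiled-style child: `e` is the parent class.
var SongResource = function (e) {
  function SongResource(name, id) {
    var s = e.call(this, name) || this; // same as super(name)
    return s.id = id, s;
  }
  return extendsHelper(SongResource, e), SongResource;
}(Resource);

// Readable equivalent.
class SongResourceClass extends Resource {
  constructor(name, id) { super(name); this.id = id; }
}

const a = new SongResource("songs", 7);
const b = new SongResourceClass("songs", 7);
console.log(a.describe(), b.describe());
console.log(a instanceof Resource, b instanceof Resource);
```

Both instances inherit `describe` from the parent, which is the behavior you can rely on when rewriting the transpiled form as `class ... extends ...`.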
| 260 | + |
| 261 | +### Class methods and props with `x.prototype.xyz = {...}` or `Object.defineProperty(x.prototype, 'xyz', {...})` |
| 262 | + |
| 263 | +These are self-explanatory, but let's go over them too. |
| 264 | + |
| 265 | +`Object.defineProperty` can be used to define [getter or setter methods](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/Method_definitions):<br> |
| 266 | + |
| 267 | +``` |
| 268 | +Object.defineProperty(API.prototype, "needsEquivalents", { get: function () { return this.userStorefrontId && this.userStorefrontId !== this.storefrontId }, enumerable: !0, configurable: !0 }) |
| 269 | +``` |
| 270 | + |
| 271 | +is a getter method that can be read as<br> |
| 272 | + |
| 273 | +``` |
| 274 | +class API { get needsEquivalents() { return this.userStorefrontId && this.userStorefrontId !== this.storefrontId } |
| 275 | +} |
| 276 | +``` |
| 277 | + |
| 278 | +Similarly, assignments to the prototype can be plain properties or methods. And so<br> |
| 279 | + |
| 280 | +``` |
| 281 | +API.prototype.activity = function (e, t) { return __awaiter(this, void 0, void 0, function () { return __generator(this, function (r) { return [2, this.resource(et.Catalog, "activities", e, t)] }) }) } |
| 282 | +``` |
| 283 | + |
| 284 | +is the same as<br> |
| 285 | + |
| 286 | +``` |
| 287 | +class API { async activity(e, t) { return this.resource(et.Catalog, "activities", e, t); } |
| 288 | +} |
| 289 | +``` |
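As a quick check of the getter translation, here is a sketch with a made-up `Session` class (the storefront values are invented for the example). Note that `!0` and `!1` in minified code are just `true` and `false`.

```javascript
// Transpiled style: getter attached via Object.defineProperty.
function Session() {
  this.userStorefrontId = "us";
  this.storefrontId = "gb";
}
Object.defineProperty(Session.prototype, "needsEquivalents", {
  get: function () {
    return this.userStorefrontId && this.userStorefrontId !== this.storefrontId;
  },
  enumerable: !0,   // !0 is true
  configurable: !0
});

// Readable equivalent. One subtle difference: getters declared with class
// syntax are non-enumerable, while the version above is enumerable.
class SessionClass {
  constructor() {
    this.userStorefrontId = "us";
    this.storefrontId = "gb";
  }
  get needsEquivalents() {
    return this.userStorefrontId && this.userStorefrontId !== this.storefrontId;
  }
}

console.log(new Session().needsEquivalents, new SessionClass().needsEquivalents);
```

For reading purposes the two are interchangeable; the enumerability difference only matters if the code iterates over properties.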
| 290 | + |
| 291 | +## 5. Use VSCode's `rename symbol` to rename variables without affecting other variables with the same name. |
| 292 | + |
| 293 | +When reverse-engineering minified JS code, it is crucial that you write comments and rename variables to "save" the knowledge you've gained while parsing through the code. |
| 294 | + |
| 295 | +When you read<br> |
| 296 | + |
| 297 | +and you realize "Aha, `r` is the username!" |
| 298 | + |
| 299 | +It is very tempting to rename _all_ instances of `r` to `username`. However, the variable `r` may also be used in different functions to mean different things. |
| 300 | + |
| 301 | +Consider this code, where `r` is used twice to mean two different things<br> |
| 302 | + |
| 303 | +``` |
| 304 | +DOMSupport.prototype._mutationDidOccur = function (e) { var t = this; e.forEach(function (e) { if ("attributes" === e.type) { /* Here, r is a value of some attribute */ var r = t.elements[e.attributeName]; r && t.attach(e.target, r) } /* Here, r is current index */ for (var i = function (r) { var i = e.addedNodes[r]; if (!i.id && !i.dataset) return "continue"; i.id && t.elements[i.id] && t.attach(i, t.elements[i.id]), t.identifiers.forEach(function (e) { i.getAttribute(e) && t.attach(i, t.elements[e]) }) }, n = 0; n < e.addedNodes.length; ++n) i(n); |
| 305 | +... |
| 306 | +``` |
| 307 | + |
| 308 | +Identifying all `r`s that mean one thing would be mind-numbing. Luckily, VSCode has a `rename symbol` feature, which can identify which variables reference the one we care about, and rename only them: |
| 309 | + |
| 310 | +1. Right click on the variable<br> |
| 311 | +2. Set new name:<br> |
| 312 | +3. After:<br> |
| 313 | + |
| 314 | +## 6. Use property names or class methods to understand the context. |
| 315 | + |
| 316 | +Let's go back to the previous point where we had<br> |
| 317 | + |
| 318 | +``` |
| 319 | +var r = t.elements[e.attributeName]; |
| 320 | +``` |
| 321 | + |
| 322 | +When you are trying to figure out the code, you can see we have a quick win here. We don't know what `r` was originally, but we see that it is probably an attribute or an element, based on the properties that were accessed. |
| 323 | + |
| 324 | +If you rename these cryptic variables to human-readable formats as you go along, you will quickly build up an approximate understanding of what's going on. |
| 325 | + |
| 326 | +## 7. Use VSCode's type inference to understand the context. |
| 327 | + |
| 328 | +Similarly to point 6, we can use VSCode's type inference to help us decipher the variable names. |
| 329 | + |
| 330 | +This is most applicable to classes, which have the type `typeof ClassName`. This tells us that the variable is the class constructor. It looks something like this: |
| 331 | + |
| 332 | +From the type hint above we know we can rename `xyz` to `DOMSupport`<br> |
| 333 | + |
| 334 | +``` |
| 335 | +DOMSupport = function () { function DOMSupport(e, t) { void 0 === e && (e = void 0), void 0 === t && (t = Si.classes); var r = this; ... |
| 336 | +``` |
| 337 | + |
| 338 | +## Conclusion |
| 339 | + |
| 340 | +That's all I had. These should take you a long way. Do you know of other tips? Ping me or add them in the comments! |