feat: allow training multiple capture actions in one recording session #562

Merged
merged 58 commits into from
Apr 30, 2025
Changes from 56 commits
7fe982a
feat: allow multiple capture action execution
RohitR311 Apr 27, 2025
c6266fd
feat: emit action type
RohitR311 Apr 27, 2025
cd4820f
feat: serialize data by action type
RohitR311 Apr 27, 2025
bd5087e
feat: run and abort categorize data by action type
RohitR311 Apr 27, 2025
35e7778
feat: schedule categorize data by action type
RohitR311 Apr 27, 2025
00ef3ba
feat: record api categorize data by action type
RohitR311 Apr 27, 2025
6243563
feat: track browser actions state
RohitR311 Apr 27, 2025
6376fd6
feat: reset interpret log list screenshot
RohitR311 Apr 28, 2025
b5c5ed7
feat: check action exists in workflow
RohitR311 Apr 28, 2025
42b56d7
feat: emit recording editor actions by type
RohitR311 Apr 28, 2025
82d6f70
feat: revamp output preview log ui
RohitR311 Apr 28, 2025
a00e69e
feat: change scrape schema merge logic
RohitR311 Apr 28, 2025
a7771cf
feat: memoize handle url change
RohitR311 Apr 28, 2025
f975862
feat: revamp run content ui
RohitR311 Apr 28, 2025
8d06146
feat: add translations for run content
RohitR311 Apr 28, 2025
4ed3160
feat: emit socket events for stopping
RohitR311 Apr 28, 2025
2ffbdc7
feat: revamp gsheet integration multiple actions
RohitR311 Apr 29, 2025
109afff
feat: revamp airtable integration multiple actions
RohitR311 Apr 29, 2025
882b25c
feat: correct key used when checking unconfirmed list fields
RohitR311 Apr 29, 2025
01ab958
feat: replace banned Function type with an explicit signature
RohitR311 Apr 29, 2025
c7e3a66
feat: maxlen 0 if field not exist
RohitR311 Apr 29, 2025
f1d0cbd
feat: rm other actions logic
RohitR311 Apr 30, 2025
b4e3ccd
Merge branch 'develop' into all-record
amhsirak Apr 30, 2025
f1c1488
fix: lint
amhsirak Apr 30, 2025
0c5e98c
fix: lint
amhsirak Apr 30, 2025
d0f284c
chore: -rm unused imports
amhsirak Apr 30, 2025
f65dda0
feat: -rm download all json
amhsirak Apr 30, 2025
302ec00
feat: -rm horizontal view
amhsirak Apr 30, 2025
e5f63be
feat: -rm vertical view
amhsirak Apr 30, 2025
9eeb367
feat: -rm box
amhsirak Apr 30, 2025
458392d
feat: -rm horizontal view from screenshots
amhsirak Apr 30, 2025
db890e0
feat: -rm vertical view from screenshots
amhsirak Apr 30, 2025
e6a7fdf
feat: -rm box
amhsirak Apr 30, 2025
2b04634
feat: -rm unused icons
amhsirak Apr 30, 2025
c202a50
feat: -rm icons for capture text and list
amhsirak Apr 30, 2025
cf5be61
feat: -rm icons
amhsirak Apr 30, 2025
47ed5ce
feat: -rm icons
amhsirak Apr 30, 2025
7c7116a
feat: -rm captured data
amhsirak Apr 30, 2025
1f06bcd
feat: -rm captured screenshots
amhsirak Apr 30, 2025
392a5fd
feat: rm workflow in progress logic
RohitR311 Apr 30, 2025
624d7fc
Merge branch 'all-record' of https://github.com/getmaxun/maxun into a…
RohitR311 Apr 30, 2025
9bc9815
feat: -rm chips
amhsirak Apr 30, 2025
94fecc1
feat: -rm download all json
amhsirak Apr 30, 2025
9c57824
feat: -rm download all json
amhsirak Apr 30, 2025
decf14a
fix: cleanup
amhsirak Apr 30, 2025
b72d0dc
chore: remove unused import
amhsirak Apr 30, 2025
43b7a7d
feat: rm view mode logic
RohitR311 Apr 30, 2025
02a150d
feat: paginate capture screenshots
RohitR311 Apr 30, 2025
b83fcb2
feat: rm left space
RohitR311 Apr 30, 2025
db25627
feat: change translations
RohitR311 Apr 30, 2025
5cd756c
feat: rm card componenent
RohitR311 Apr 30, 2025
3b618d8
feat: rm screenshot items chip
RohitR311 Apr 30, 2025
6419d31
feat: buttons ui change, rm render expand
RohitR311 Apr 30, 2025
daa9779
feat: rm unnecessray imports
RohitR311 Apr 30, 2025
eed8ff3
Merge pull request #574 from getmaxun/all-record-ui
RohitR311 Apr 30, 2025
7b6de7d
feat: fix download ss logic
RohitR311 Apr 30, 2025
f5df4c9
feat: rm other page and data
RohitR311 Apr 30, 2025
e09e794
feat: show tabs only if multiple actions
RohitR311 Apr 30, 2025
97 changes: 61 additions & 36 deletions maxun-core/src/interpret.ts
@@ -43,8 +43,9 @@ interface InterpreterOptions {
   binaryCallback: (output: any, mimeType: string) => (void | Promise<void>);
   debug: boolean;
   debugChannel: Partial<{
-    activeId: Function,
-    debugMessage: Function,
+    activeId: (id: number) => void,
+    debugMessage: (msg: string) => void,
+    setActionType: (type: string) => void,
   }>
 }
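
Reviewer note: a minimal sketch of how a caller might consume the new `setActionType` callback to bucket output by action type. Only the three callback signatures come from the hunk above; `OutputCollector`, `onData`, and `byAction` are hypothetical names used for illustration, and `onData` merely stands in for whatever callback receives serialized output.

```typescript
// Callback signatures copied from the hunk; everything else is illustrative.
type DebugChannel = Partial<{
  activeId: (id: number) => void;
  debugMessage: (msg: string) => void;
  setActionType: (type: string) => void;
}>;

// Hypothetical consumer: remembers the last announced action type and
// files every piece of output under it.
class OutputCollector {
  private currentAction = "unknown";
  readonly byAction: Record<string, unknown[]> = {};

  readonly debugChannel: DebugChannel = {
    setActionType: (type) => {
      this.currentAction = type;
    },
  };

  // Stand-in for the callback that receives serialized output.
  onData(data: unknown): void {
    const bucket = this.byAction[this.currentAction] ?? [];
    bucket.push(data);
    this.byAction[this.currentAction] = bucket;
  }
}

const collector = new OutputCollector();
// Every member of a Partial<> may be missing, hence the optional call.
collector.debugChannel.setActionType?.("scrapeSchema");
collector.onData([{ title: "A" }]);
collector.debugChannel.setActionType?.("scrapeList");
collector.onData([{ name: "x" }, { name: "y" }]);
console.log(Object.keys(collector.byAction)); // [ 'scrapeSchema', 'scrapeList' ]
```

Because `debugChannel` is a `Partial`, the optional call `setActionType?.(...)` is equivalent to the explicit `if (...setActionType)` guard used throughout this diff.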

@@ -377,12 +378,20 @@ export default class Interpreter extends EventEmitter {
      */
     const wawActions: Record<CustomFunctions, (...args: any[]) => void> = {
       screenshot: async (params: PageScreenshotOptions) => {
+        if (this.options.debugChannel?.setActionType) {
+          this.options.debugChannel.setActionType('screenshot');
+        }
+
         const screenshotBuffer = await page.screenshot({
           ...params, path: undefined,
         });
         await this.options.binaryCallback(screenshotBuffer, 'image/png');
       },
       enqueueLinks: async (selector: string) => {
+        if (this.options.debugChannel?.setActionType) {
+          this.options.debugChannel.setActionType('enqueueLinks');
+        }
+
         const links: string[] = await page.locator(selector)
           .evaluateAll(
             // @ts-ignore
@@ -409,55 +418,51 @@ export default class Interpreter extends EventEmitter {
         await page.close();
       },
       scrape: async (selector?: string) => {
+        if (this.options.debugChannel?.setActionType) {
+          this.options.debugChannel.setActionType('scrape');
+        }
+
         await this.ensureScriptsLoaded(page);

         const scrapeResults: Record<string, string>[] = await page.evaluate((s) => window.scrape(s ?? null), selector);
         await this.options.serializableCallback(scrapeResults);
       },

       scrapeSchema: async (schema: Record<string, { selector: string; tag: string, attribute: string; shadow: string}>) => {
+        if (this.options.debugChannel?.setActionType) {
+          this.options.debugChannel.setActionType('scrapeSchema');
+        }
+
         await this.ensureScriptsLoaded(page);

         const scrapeResult = await page.evaluate((schemaObj) => window.scrapeSchema(schemaObj), schema);

-        const newResults = Array.isArray(scrapeResult) ? scrapeResult : [scrapeResult];
-        newResults.forEach((result) => {
-          Object.entries(result).forEach(([key, value]) => {
-            const keyExists = this.cumulativeResults.some(
-              (item) => key in item && item[key] !== undefined
-            );
-
-            if (!keyExists) {
-              this.cumulativeResults.push({ [key]: value });
-            }
-          });
+        if (!this.cumulativeResults || !Array.isArray(this.cumulativeResults)) {
+          this.cumulativeResults = [];
+        }
+
+        if (this.cumulativeResults.length === 0) {
+          this.cumulativeResults.push({});
+        }
+
+        const mergedResult = this.cumulativeResults[0];
+        const resultToProcess = Array.isArray(scrapeResult) ? scrapeResult[0] : scrapeResult;
+
+        Object.entries(resultToProcess).forEach(([key, value]) => {
+          if (value !== undefined) {
+            mergedResult[key] = value;
+          }
         });
-
-        const mergedResult: Record<string, string>[] = [
-          Object.fromEntries(
-            Object.entries(
-              this.cumulativeResults.reduce((acc, curr) => {
-                Object.entries(curr).forEach(([key, value]) => {
-                  // If the key doesn't exist or the current value is not undefined, add/update it
-                  if (value !== undefined) {
-                    acc[key] = value;
-                  }
-                });
-                return acc;
-              }, {})
-            )
-          )
-        ];
-
-        // Log cumulative results after each action
-        console.log("CUMULATIVE results:", this.cumulativeResults);
-        console.log("MERGED results:", mergedResult);
-
-        await this.options.serializableCallback(mergedResult);
-        // await this.options.serializableCallback(scrapeResult);
+
+        console.log("Updated merged result:", mergedResult);
+        await this.options.serializableCallback([mergedResult]);
       },

       scrapeList: async (config: { listSelector: string, fields: any, limit?: number, pagination: any }) => {
+        if (this.options.debugChannel?.setActionType) {
+          this.options.debugChannel.setActionType('scrapeList');
+        }
+
         await this.ensureScriptsLoaded(page);
         if (!config.pagination) {
           const scrapeResults: Record<string, any>[] = await page.evaluate((cfg) => window.scrapeList(cfg), config);
@@ -469,6 +474,10 @@ export default class Interpreter extends EventEmitter {
       },

       scrapeListAuto: async (config: { listSelector: string }) => {
+        if (this.options.debugChannel?.setActionType) {
+          this.options.debugChannel.setActionType('scrapeListAuto');
+        }
+
         await this.ensureScriptsLoaded(page);

         const scrapeResults: { selector: string, innerText: string }[] = await page.evaluate((listSelector) => {
@@ -479,6 +488,10 @@ export default class Interpreter extends EventEmitter {
       },

       scroll: async (pages?: number) => {
+        if (this.options.debugChannel?.setActionType) {
+          this.options.debugChannel.setActionType('scroll');
+        }
+
         await page.evaluate(async (pagesInternal) => {
           for (let i = 1; i <= (pagesInternal ?? 1); i += 1) {
             // @ts-ignore
@@ -488,6 +501,10 @@ export default class Interpreter extends EventEmitter {
       },

       script: async (code: string) => {
+        if (this.options.debugChannel?.setActionType) {
+          this.options.debugChannel.setActionType('script');
+        }
+
         const AsyncFunction: FunctionConstructor = Object.getPrototypeOf(
           async () => { },
         ).constructor;
@@ -496,6 +513,10 @@ export default class Interpreter extends EventEmitter {
       },

       flag: async () => new Promise((res) => {
+        if (this.options.debugChannel?.setActionType) {
+          this.options.debugChannel.setActionType('flag');
+        }
+
         this.emit('flag', page, res);
       }),
     };
@@ -526,6 +547,10 @@ export default class Interpreter extends EventEmitter {
         const params = !step.args || Array.isArray(step.args) ? step.args : [step.args];
         await wawActions[step.action as CustomFunctions](...(params ?? []));
       } else {
+        if (this.options.debugChannel?.setActionType) {
+          this.options.debugChannel.setActionType(String(step.action));
+        }
+
         // Implements the dot notation for the "method name" in the workflow
         const levels = String(step.action).split('.');
         const methodName = levels[levels.length - 1];
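
Reviewer note: the new `scrapeSchema` merge logic in this file can be read in isolation as the sketch below. It is an approximation under stated assumptions, not the code from the PR: `mergeSchemaResult` and `SchemaResult` are hypothetical names, the accumulator is passed in rather than stored on `this.cumulativeResults`, and the `?? {}` guard for an empty array result is an addition that the diff does not contain.

```typescript
// Hypothetical standalone version of the merge step; names are illustrative.
type SchemaResult = Record<string, unknown>;

function mergeSchemaResult(
  cumulative: SchemaResult[],
  scrapeResult: SchemaResult | SchemaResult[],
): SchemaResult[] {
  // Keep a single accumulator object at index 0, as the new code does.
  if (cumulative.length === 0) {
    cumulative.push({});
  }
  const merged = cumulative[0];

  // As in the diff, only the first element of an array result is used.
  // The `?? {}` fallback is an extra guard added for this sketch.
  const toProcess = Array.isArray(scrapeResult) ? scrapeResult[0] : scrapeResult;

  for (const [key, value] of Object.entries(toProcess ?? {})) {
    if (value !== undefined) {
      merged[key] = value; // later values overwrite earlier ones
    }
  }
  return [merged];
}

const results: SchemaResult[] = [];
mergeSchemaResult(results, { title: "A" });
const out = mergeSchemaResult(results, [{ price: "10", title: "B" }]);
console.log(JSON.stringify(out)); // [{"title":"B","price":"10"}]
```

Compared with the removed logic, which pushed one single-key object per new key and never updated an existing key, this version keeps one object and lets a later capture overwrite an earlier value for the same key.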
1 change: 0 additions & 1 deletion package.json
@@ -50,7 +50,6 @@
     "lodash": "^4.17.21",
     "loglevel": "^1.8.0",
     "loglevel-plugin-remote": "^0.6.8",
-    "maxun-core": "^0.0.15",
     "minio": "^8.0.1",
     "moment-timezone": "^0.5.45",
     "node-cron": "^3.0.3",
23 changes: 13 additions & 10 deletions public/locales/de.json
@@ -535,20 +535,23 @@
       "output_data": "Ausgabedaten",
       "log": "Protokoll"
     },
-    "empty_output": "Die Ausgabe ist leer.",
-    "loading": "Ausführung läuft. Extrahierte Daten werden nach Abschluss des Durchlaufs hier angezeigt.",
+    "buttons": {
+      "stop": "Stoppen"
+    },
+    "loading": "Daten werden geladen...",
+    "empty_output": "Keine Ausgabedaten verfügbar",
     "captured_data": {
       "title": "Erfasste Daten",
-      "download_json": "Als JSON herunterladen",
-      "download_csv": "Als CSV herunterladen"
+      "download_csv": "CSV herunterladen",
+      "view_full": "Vollständige Daten anzeigen",
+      "items": "Elemente",
+      "schema_title": "Erfasste Texte",
+      "list_title": "Erfasste Listen"
     },
     "captured_screenshot": {
-      "title": "Erfasster Screenshot",
-      "download": "Screenshot herunterladen",
-      "render_failed": "Das Bild konnte nicht gerendert werden"
-    },
-    "buttons": {
-      "stop": "Stoppen"
+      "title": "Erfasste Screenshots",
+      "download": "Herunterladen",
+      "render_failed": "Fehler beim Rendern des Screenshots"
     }
   },
   "navbar": {
28 changes: 18 additions & 10 deletions public/locales/en.json
@@ -177,6 +177,11 @@
       "pagination": "Select how the robot can capture the rest of the list",
       "limit": "Choose the number of items to extract",
       "complete": "Capture is complete"
+    },
+    "actions": {
+      "text": "Capture Text",
+      "list": "Capture List",
+      "screenshot": "Capture Screenshot"
     }
   },
   "right_panel": {
@@ -543,20 +548,23 @@
       "output_data": "Output Data",
       "log": "Log"
     },
-    "empty_output": "The output is empty.",
-    "loading": "Run in progress. Extracted data will appear here once run completes.",
+    "buttons": {
+      "stop": "Stop"
+    },
+    "loading": "Loading data...",
+    "empty_output": "No output data available",
     "captured_data": {
       "title": "Captured Data",
-      "download_json": "Download as JSON",
-      "download_csv": "Download as CSV"
+      "download_csv": "Download CSV",
+      "view_full": "View Full Data",
+      "items": "items",
+      "schema_title": "Captured Texts",
+      "list_title": "Captured Lists"
     },
     "captured_screenshot": {
-      "title": "Captured Screenshot",
-      "download": "Download Screenshot",
-      "render_failed": "The image failed to render"
-    },
-    "buttons": {
-      "stop": "Stop"
+      "title": "Captured Screenshots",
+      "download": "Download",
+      "render_failed": "Failed to render screenshot"
     }
   },
   "navbar": {
25 changes: 14 additions & 11 deletions public/locales/es.json
@@ -536,20 +536,23 @@
       "output_data": "Datos de Salida",
       "log": "Registro"
     },
-    "empty_output": "La salida está vacía.",
-    "loading": "Ejecución en curso. Los datos extraídos aparecerán aquí una vez que se complete la ejecución.",
+    "buttons": {
+      "stop": "Detener"
+    },
+    "loading": "Cargando datos...",
+    "empty_output": "No hay datos de salida disponibles",
     "captured_data": {
-      "title": "Datos Capturados",
-      "download_json": "Descargar como JSON",
-      "download_csv": "Descargar como CSV"
+      "title": "Datos capturados",
+      "download_csv": "Descargar CSV",
+      "view_full": "Ver datos completos",
+      "items": "elementos",
+      "schema_title": "Textos capturados",
+      "list_title": "Listas capturadas"
     },
     "captured_screenshot": {
-      "title": "Captura de Pantalla",
-      "download": "Descargar Captura",
-      "render_failed": "No se pudo renderizar la imagen"
-    },
-    "buttons": {
-      "stop": "Detener"
+      "title": "Capturas de pantalla",
+      "download": "Descargar",
+      "render_failed": "Error al renderizar la captura de pantalla"
     }
   },
   "navbar": {
25 changes: 14 additions & 11 deletions public/locales/ja.json
@@ -536,20 +536,23 @@
       "output_data": "出力データ",
       "log": "ログ"
     },
-    "empty_output": "出力は空です。",
-    "loading": "実行中です。実行が完了すると、抽出されたデータがここに表示されます。",
+    "buttons": {
+      "stop": "停止"
+    },
+    "loading": "データを読み込み中...",
+    "empty_output": "出力データがありません",
     "captured_data": {
-      "title": "キャプチャされたデータ",
-      "download_json": "JSONとしてダウンロード",
-      "download_csv": "CSVとしてダウンロード"
+      "title": "キャプチャしたデータ",
+      "download_csv": "CSVをダウンロード",
+      "view_full": "完全なデータを表示",
+      "items": "アイテム",
+      "schema_title": "キャプチャしたテキスト",
+      "list_title": "キャプチャしたリスト"
     },
     "captured_screenshot": {
-      "title": "キャプチャされたスクリーンショット",
-      "download": "スクリーンショットをダウンロード",
-      "render_failed": "画像のレンダリングに失敗しました"
-    },
-    "buttons": {
-      "stop": "停止"
+      "title": "キャプチャしたスクリーンショット",
+      "download": "ダウンロード",
+      "render_failed": "スクリーンショットのレンダリングに失敗しました"
     }
   },
   "navbar": {
25 changes: 14 additions & 11 deletions public/locales/zh.json
@@ -536,20 +536,23 @@
       "output_data": "输出数据",
       "log": "日志"
     },
-    "empty_output": "输出为空。",
-    "loading": "运行中。运行完成后,提取的数据将显示在此处。",
+    "buttons": {
+      "stop": "停止"
+    },
+    "loading": "加载数据中...",
+    "empty_output": "没有可用的输出数据",
     "captured_data": {
-      "title": "捕获的数据",
-      "download_json": "下载为JSON",
-      "download_csv": "下载为CSV"
+      "title": "已捕获的数据",
+      "download_csv": "下载CSV",
+      "view_full": "查看完整数据",
+      "items": "项目",
+      "schema_title": "已捕获的文本",
+      "list_title": "已捕获的列表"
     },
     "captured_screenshot": {
-      "title": "捕获的截图",
-      "download": "下载截图",
-      "render_failed": "图像渲染失败"
-    },
-    "buttons": {
-      "stop": "停止"
+      "title": "已捕获的截图",
+      "download": "下载",
+      "render_failed": "渲染截图失败"
     }
   },
   "navbar": {
10 changes: 9 additions & 1 deletion server/src/api/record.ts
@@ -586,6 +586,11 @@ async function executeRun(id: string, userId: string) {
     const binaryOutputService = new BinaryOutputService('maxun-run-screenshots');
     const uploadedBinaryOutput = await binaryOutputService.uploadAndStoreBinaryOutput(run, interpretationInfo.binaryOutput);

+    const categorizedOutput = {
+      scrapeSchema: interpretationInfo.scrapeSchemaOutput || {},
+      scrapeList: interpretationInfo.scrapeListOutput || {},
+    };
+
     await destroyRemoteBrowser(plainRun.browserId, userId);

     const updatedRun = await run.update({
@@ -594,7 +599,10 @@ async function executeRun(id: string, userId: string) {
       finishedAt: new Date().toLocaleString(),
       browserId: plainRun.browserId,
       log: interpretationInfo.log.join('\n'),
-      serializableOutput: interpretationInfo.serializableOutput,
+      serializableOutput: {
+        scrapeSchema: Object.values(categorizedOutput.scrapeSchema),
+        scrapeList: Object.values(categorizedOutput.scrapeList),
+      },
       binaryOutput: uploadedBinaryOutput,
     });
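
Reviewer note: the resulting shape of `serializableOutput` can be illustrated with the sketch below. It assumes, without confirmation from this diff, that `scrapeSchemaOutput` and `scrapeListOutput` are plain objects keyed by some per-action identifier; the `InterpretationInfo` interface, the `toSerializableOutput` helper, and the `"schema-0"` key are hypothetical.

```typescript
// Hypothetical shapes: the diff only shows that these two fields exist
// and that Object.values() is applied to each of them.
interface InterpretationInfo {
  scrapeSchemaOutput?: Record<string, unknown>;
  scrapeListOutput?: Record<string, unknown>;
}

function toSerializableOutput(info: InterpretationInfo) {
  // Missing categories fall back to an empty object, as in the hunk above.
  const categorized = {
    scrapeSchema: info.scrapeSchemaOutput || {},
    scrapeList: info.scrapeListOutput || {},
  };
  // Drop the keys and keep the per-action values as arrays.
  return {
    scrapeSchema: Object.values(categorized.scrapeSchema),
    scrapeList: Object.values(categorized.scrapeList),
  };
}

const output = toSerializableOutput({
  scrapeSchemaOutput: { "schema-0": { title: "A" } },
});
console.log(JSON.stringify(output));
// {"scrapeSchema":[{"title":"A"}],"scrapeList":[]}
```

Consumers that previously read `serializableOutput` as a flat value would need to read `serializableOutput.scrapeSchema` and `serializableOutput.scrapeList` instead.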
