Commit 88075ba

feat: add methods for fetching transcript content directly
1 parent 28078d7 commit 88075ba

5 files changed

+321
-152
lines changed

README.md

Lines changed: 34 additions & 30 deletions
@@ -81,8 +81,11 @@ for [finding specific transcripts](#find-transcripts) by language or by type (ma
8181
TranscriptList transcriptList = youtubeTranscriptApi.listTranscripts("videoId");
8282

8383
// Iterate over transcript list
84-
for(Transcript transcript : transcriptList) {
85-
System.out.println(transcript);
84+
for (Transcript transcript : transcriptList) {
85+
    System.out.println(transcript);
8689
}
8790

8891
// Find transcript in specific language
@@ -290,48 +293,49 @@ Playlists and channels information is retrieved from
290293
the [YouTube V3 API](https://developers.google.com/youtube/v3/docs/),
291294
so you will need to provide API key for all methods.
292295

296+
All methods take a `TranscriptRequest` object as a parameter,
297+
which contains the following fields:
298+
299+
- `apiKey` - YouTube API key.
300+
- `stopOnError` (optional, defaults to `true`) - Whether to stop on the first error or continue. If `true`, the method will
301+
fail fast by throwing an error if one of the transcripts could not be retrieved,
302+
otherwise it will ignore failed transcripts.
303+
304+
- `cookies` (optional) - Path to [cookies.txt](#cookies) file.
305+
306+
All methods return a map which contains the video ID as a key and the corresponding result as a value.
307+
293308
```java
294309
// Create a new default PlaylistsTranscriptApi instance
295310
PlaylistsTranscriptApi playlistsTranscriptApi = TranscriptApiFactory.createDefaultPlaylistsApi();
296311

312+
// Create request object
313+
TranscriptRequest request = new TranscriptRequest("apiKey");
314+
297315
// Retrieve all available transcripts for a given playlist
298-
Map<String, TranscriptList> transcriptLists = playlistsTranscriptApi.listTranscriptsForPlaylist(
299-
"playlistId",
300-
"apiKey",
301-
true);
316+
Map<String, TranscriptList> playlistTranscriptLists = playlistsTranscriptApi.listTranscriptsForPlaylist("playlistId", request);
302317

303318
// Retrieve all available transcripts for a given channel
304-
Map<String, TranscriptList> transcriptLists = playlistsTranscriptApi.listTranscriptsForChannel(
305-
"channelName",
306-
"apiKey",
307-
true);
319+
Map<String, TranscriptList> channelTranscriptLists = playlistsTranscriptApi.listTranscriptsForChannel("channelName", request);
308320
```
309321

310-
As you can see, there is also a boolean flag `continueOnError`, which tells whether to continue if transcript retrieval
311-
fails for a video or not. For example, if it's set to `true`, all transcripts that could not be retrieved will be
312-
skipped, if
313-
it's set to `false`, operation will fail fast on the first error.
314-
315-
All methods are also have overloaded versions which accept path to [cookies.txt](#cookies) file.
322+
Same as with the `YoutubeTranscriptApi`, you can also fetch transcript content directly
323+
using [fallback languages](#use-fallback-language) if needed.
316324

317325
```java
318-
// Retrieve all available transcripts for a given playlist
319-
Map<String, TranscriptList> transcriptLists = playlistsTranscriptApi.listTranscriptsForPlaylist(
320-
"playlistId",
321-
"apiKey",
322-
true,
323-
"path/to/cookies.txt"
324-
);
326+
// Create request object
327+
TranscriptRequest request = new TranscriptRequest("apiKey");
325328

326-
// Retrieve all available transcripts for a given channel
327-
Map<String, TranscriptList> transcriptLists = playlistsTranscriptApi.listTranscriptsForChannel(
328-
"channelName",
329-
"apiKey",
330-
true,
331-
"path/to/cookies.txt"
332-
);
329+
// Retrieve transcript content for all videos in a playlist
330+
Map<String, TranscriptContent> playlistTranscripts = playlistsTranscriptApi.getTranscriptsForPlaylist("playlistId", request);
331+
332+
// Retrieve transcript content for all videos in a channel, trying English first and falling back to German
333+
Map<String, TranscriptContent> channelTranscripts = playlistsTranscriptApi.getTranscriptsForChannel("channelName", request, "en", "de");
333334
```
334335

336+
> **Note:** If you want to get transcript content in a different format, refer
337+
> to [Use Formatters](#use-formatters).
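
The fallback-language behaviour mentioned above (language codes tried in descending priority, with English as the default when none are given) can be illustrated in isolation. The following is only a sketch: `FallbackLanguageSketch` and `pickLanguage` are hypothetical names that are not part of the library, and the set of available languages is made up for the example.

```java
import java.util.Optional;
import java.util.Set;

// Hypothetical illustration only: these names are not part of the library.
class FallbackLanguageSketch {

    /**
     * Returns the first requested language code that is available,
     * trying the codes in the order given (descending priority).
     * When no codes are supplied, English ("en") is used as the default.
     */
    static Optional<String> pickLanguage(Set<String> available, String... languageCodes) {
        String[] wanted = languageCodes.length == 0 ? new String[] {"en"} : languageCodes;
        for (String code : wanted) {
            if (available.contains(code)) {
                return Optional.of(code);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Set<String> available = Set.of("en", "fr"); // made-up example data
        System.out.println(pickLanguage(available, "de", "en")); // Optional[en]
        System.out.println(pickLanguage(available));             // Optional[en]
        System.out.println(pickLanguage(available, "de"));       // Optional.empty
    }
}
```

Note that each language code is a separate argument. A single string such as `"en, de"` would be treated as one (non-existent) code.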
338+
335339
## 🤓 How it works
336340

337341
Within each YouTube video page, there exists JSON data containing all the transcript information, including an

lib/src/main/java/io/github/thoroldvix/api/PlaylistsTranscriptApi.java

Lines changed: 39 additions & 31 deletions
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,11 @@
88
* Retrieves transcripts for all videos in a playlist, or all videos for a specific channel.
99
* <p>
1010
* Playlists and channel videos are retrieved from the YouTube API, so you will need to have a valid api key to use this.
11+
* <p>
12+
* All methods take a {@link TranscriptRequest} object as a parameter, which contains API key, cookies file path (optional), and stop on error flag (optional, defaults to true).
13+
* If cookies are not provided, the API will not be able to access age restricted videos, see <a href="https://github.com/Thoroldvix/youtube-transcript-api#cookies">Cookies</a>.
14+
* <p>
15+
* {@link TranscriptRequest} also contains a flag to stop on error, or continue on error.
1116
* </p>
1217
* <p>
1318
* To get implementation for this interface see {@link TranscriptApiFactory}
@@ -16,56 +21,59 @@
1621
public interface PlaylistsTranscriptApi {
1722

1823
/**
19-
* Retrieves transcript lists for all videos in the specified playlist using provided API key and cookies file from a specified path.
24+
* Retrieves transcript lists for all videos in the specified playlist.
2025
*
21-
* @param playlistId The ID of the playlist
22-
* @param apiKey API key for the YouTube V3 API (see <a href="https://developers.google.com/youtube/v3/getting-started">Getting started</a>)
23-
* @param continueOnError Whether to continue if transcript retrieval fails for a video. If true, all transcripts that could not be retrieved will be skipped,
24-
* otherwise an exception will be thrown.
25-
* @param cookiesPath The file path to the text file containing the authentication cookies. Used in the case if some videos are age restricted see {<a href="https://github.com/Thoroldvix/youtube-transcript-api#cookies">Cookies</a>}
26+
* @param playlistId The ID of the playlist
27+
* @param request {@link TranscriptRequest} request object containing API key, cookies file path, and stop on error flag
2628
* @return A map of video IDs to {@link TranscriptList} objects
2729
* @throws TranscriptRetrievalException If the retrieval of the transcript lists fails
2830
*/
29-
Map<String, TranscriptList> listTranscriptsForPlaylist(String playlistId, String apiKey, String cookiesPath, boolean continueOnError) throws TranscriptRetrievalException;
31+
Map<String, TranscriptList> listTranscriptsForPlaylist(String playlistId, TranscriptRequest request) throws TranscriptRetrievalException;
3032

3133

3234
/**
33-
* Retrieves transcript lists for all videos in the specified playlist using provided API key.
35+
* Retrieves transcript lists for all videos for the specified channel.
3436
*
35-
* @param playlistId The ID of the playlist
36-
* @param apiKey API key for the YouTube V3 API (see <a href="https://developers.google.com/youtube/v3/getting-started">Getting started</a>)
37-
* @param continueOnError Whether to continue if transcript retrieval fails for a video. If true, all transcripts that could not be retrieved will be skipped,
38-
* otherwise an exception will be thrown.
37+
* @param channelName The name of the channel
38+
* @param request {@link TranscriptRequest} request object containing API key, cookies file path, and stop on error flag
3939
* @return A map of video IDs to {@link TranscriptList} objects
4040
* @throws TranscriptRetrievalException If the retrieval of the transcript lists fails
4141
*/
42-
Map<String, TranscriptList> listTranscriptsForPlaylist(String playlistId, String apiKey, boolean continueOnError) throws TranscriptRetrievalException;
42+
Map<String, TranscriptList> listTranscriptsForChannel(String channelName, TranscriptRequest request) throws TranscriptRetrievalException;
4343

4444

4545
/**
46-
* Retrieves transcript lists for all videos for the specified channel using provided API key and cookies file from a specified path.
46+
* Retrieves transcript content for all videos in the specified playlist.
4747
*
48-
* @param channelName The name of the channel
49-
* @param apiKey API key for the YouTube V3 API (see <a href="https://developers.google.com/youtube/v3/getting-started">Getting started</a>)
50-
* @param cookiesPath The file path to the text file containing the authentication cookies. Used in the case if some videos are age restricted see {<a href="https://github.com/Thoroldvix/youtube-transcript-api#cookies">Cookies</a>}
51-
* @param continueOnError Whether to continue if transcript retrieval fails for a video. If true, all transcripts that could not be retrieved will be skipped,
52-
* otherwise an exception will be thrown.
53-
* @return A map of video IDs to {@link TranscriptList} objects
54-
* @throws TranscriptRetrievalException If the retrieval of the transcript lists fails
55-
* @throws TranscriptRetrievalException If the retrieval of the transcript lists fails
48+
* @param playlistId The ID of the playlist
49+
* @param request {@link TranscriptRequest} request object containing API key, cookies file path, and stop on error flag
50+
* @param languageCodes A varargs list of language codes in descending priority.
51+
* <p>
52+
* For example:
53+
* </p>
54+
* If this is set to {@code ("de", "en")}, it will first attempt to fetch the German transcript ("de"), and then fetch the English
55+
* transcript ("en") if the former fails. If no language code is provided, it uses English as the default language.
56+
* @return A map of video IDs to {@link TranscriptContent} objects
57+
* @throws TranscriptRetrievalException If the retrieval of the transcript fails
5658
*/
57-
Map<String, TranscriptList> listTranscriptsForChannel(String channelName, String apiKey, String cookiesPath, boolean continueOnError) throws TranscriptRetrievalException;
59+
Map<String, TranscriptContent> getTranscriptsForPlaylist(String playlistId,
60+
TranscriptRequest request,
61+
String... languageCodes) throws TranscriptRetrievalException;
5862

5963

6064
/**
61-
* Retrieves transcript lists for all videos for the specified channel using provided API key.
65+
* Retrieves transcript content for all videos for the specified channel.
6266
*
63-
* @param channelName The name of the channel
64-
* @param apiKey API key for the YouTube V3 API (see <a href="https://developers.google.com/youtube/v3/getting-started">Getting started</a>)
65-
* @param continueOnError Whether to continue if transcript retrieval fails for a video. If true, all transcripts that could not be retrieved will be skipped,
66-
* otherwise an exception will be thrown.
67-
* @return A map of video IDs to {@link TranscriptList} objects
68-
* @throws TranscriptRetrievalException If the retrieval of the transcript lists fails
67+
* @param channelName The name of the channel
68+
* @param request {@link TranscriptRequest} request object containing API key, cookies file path, and stop on error flag
69+
* @param languageCodes A varargs list of language codes in descending priority.
70+
* <p>
71+
* For example:
72+
* </p>
73+
* If this is set to {@code ("de", "en")}, it will first attempt to fetch the German transcript ("de"), and then fetch the English
74+
* transcript ("en") if the former fails. If no language code is provided, it uses English as the default language.
75+
* @return A map of video IDs to {@link TranscriptContent} objects
76+
* @throws TranscriptRetrievalException If the retrieval of the transcript fails
6977
*/
70-
Map<String, TranscriptList> listTranscriptsForChannel(String channelName, String apiKey, boolean continueOnError) throws TranscriptRetrievalException;
78+
Map<String, TranscriptContent> getTranscriptsForChannel(String channelName, TranscriptRequest request, String... languageCodes) throws TranscriptRetrievalException;
7179
}
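
The stop-on-error contract described in the Javadoc above (fail fast on the first failed video, or skip failed videos and keep going) can be sketched independently of the library. This is not the implementation from this commit: `StopOnErrorSketch`, `collect`, and the `fetcher` function are hypothetical, and an unchecked exception stands in for `TranscriptRetrievalException`.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only: not the library's implementation.
class StopOnErrorSketch {

    /**
     * Applies fetcher to every video ID and collects the results by ID.
     * If stopOnError is true, the first failure is rethrown (fail fast);
     * otherwise failed videos are skipped and absent from the map.
     */
    static <T> Map<String, T> collect(List<String> videoIds,
                                      Function<String, T> fetcher,
                                      boolean stopOnError) {
        Map<String, T> results = new LinkedHashMap<>();
        for (String videoId : videoIds) {
            try {
                results.put(videoId, fetcher.apply(videoId));
            } catch (RuntimeException e) {
                if (stopOnError) {
                    throw e; // fail fast on the first error
                }
                // stopOnError == false: ignore this video and continue
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> ids = List.of("a", "b", "c");
        // Stand-in fetcher that fails for video "b"
        Function<String, String> fetcher = id -> {
            if (id.equals("b")) {
                throw new IllegalStateException("no transcript for " + id);
            }
            return "transcript-" + id;
        };

        System.out.println(collect(ids, fetcher, false)); // {a=transcript-a, c=transcript-c}
        try {
            collect(ids, fetcher, true);
        } catch (IllegalStateException e) {
            System.out.println("failed fast: " + e.getMessage()); // failed fast: no transcript for b
        }
    }
}
```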
Lines changed: 67 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,67 @@
1+
package io.github.thoroldvix.api;
2+
3+
/**
4+
* Request object for retrieving transcripts from {@link PlaylistsTranscriptApi}.
5+
* <p>
6+
* Contains API key required for the YouTube V3 API,
7+
* and optionally a file path to the text file containing the authentication cookies. If cookies are not provided, the API will not be able to access age restricted videos.
8+
* Also contains a flag to stop on error, or continue on error. The flag defaults to true if not provided.
9+
* </p>
11+
*/
12+
public class TranscriptRequest {
13+
private final String apiKey;
14+
private final String cookiesPath;
15+
private final boolean stopOnError;
16+
17+
/**
18+
* Creates a new instance of {@link TranscriptRequest}
19+
*
20+
* @param apiKey API key for the YouTube V3 API (see <a href="https://developers.google.com/youtube/v3/getting-started">Getting started</a>)
21+
* @param cookiesPath The file path to the text file containing the authentication cookies. Needed when some videos are age-restricted, see <a href="https://github.com/Thoroldvix/youtube-transcript-api#cookies">Cookies</a>
22+
* @param stopOnError Whether to stop if transcript retrieval fails for a video. If false, all transcripts that could not be retrieved will be skipped,
23+
*                    otherwise an exception will be thrown on the first error.
24+
*/
25+
public TranscriptRequest(String apiKey, String cookiesPath, boolean stopOnError) {
26+
if (apiKey == null || apiKey.isBlank()) {
27+
throw new IllegalArgumentException("API key cannot be null or blank");
28+
}
29+
this.apiKey = apiKey;
30+
this.cookiesPath = cookiesPath;
31+
this.stopOnError = stopOnError;
32+
}
33+
34+
public TranscriptRequest(String apiKey, String cookiesPath) {
35+
this(apiKey, cookiesPath, true);
36+
}
37+
38+
public TranscriptRequest(String apiKey) {
39+
this(apiKey, null, true);
40+
}
41+
42+
public TranscriptRequest(String apiKey, boolean stopOnError) {
43+
this(apiKey, null, stopOnError);
44+
}
45+
46+
/**
47+
* @return API key for the YouTube V3 API (see <a href="https://developers.google.com/youtube/v3/getting-started">Getting started</a>)
48+
*/
49+
public String getApiKey() {
50+
return apiKey;
51+
}
52+
53+
/**
54+
* @return The file path to the text file containing the authentication cookies. Needed when some videos are age-restricted, see <a href="https://github.com/Thoroldvix/youtube-transcript-api#cookies">Cookies</a>
55+
*/
56+
public String getCookiesPath() {
57+
return cookiesPath;
58+
}
59+
60+
/**
61+
* @return Whether to stop if transcript retrieval fails for a video. If false, all transcripts that could not be retrieved will be skipped,
62+
* otherwise an exception will be thrown on the first error.
63+
*/
64+
public boolean isStopOnError() {
65+
return stopOnError;
66+
}
67+
}
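
A short usage sketch of the four constructors above and their defaults. So that the snippet compiles on its own, it carries a trimmed copy of `TranscriptRequest` from this diff (Javadoc and `public` modifiers dropped); `TranscriptRequestDemo` is a hypothetical name used only for this example, and `"path/to/cookies.txt"` is a placeholder path.

```java
// Hypothetical demo class, not part of the commit.
class TranscriptRequestDemo {
    public static void main(String[] args) {
        // API key only: no cookies, stopOnError defaults to true
        TranscriptRequest defaults = new TranscriptRequest("apiKey");
        // API key plus cookies file: stopOnError still defaults to true
        TranscriptRequest withCookies = new TranscriptRequest("apiKey", "path/to/cookies.txt");
        // API key plus explicit flag: skip failed videos instead of failing fast
        TranscriptRequest lenient = new TranscriptRequest("apiKey", false);

        System.out.println(defaults.isStopOnError());     // true
        System.out.println(withCookies.getCookiesPath()); // path/to/cookies.txt
        System.out.println(lenient.isStopOnError());      // false

        try {
            new TranscriptRequest(" "); // blank keys are rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // API key cannot be null or blank
        }
    }
}

// Trimmed copy of the TranscriptRequest class from this commit,
// included only so that the snippet is self-contained.
class TranscriptRequest {
    private final String apiKey;
    private final String cookiesPath;
    private final boolean stopOnError;

    TranscriptRequest(String apiKey, String cookiesPath, boolean stopOnError) {
        if (apiKey == null || apiKey.isBlank()) {
            throw new IllegalArgumentException("API key cannot be null or blank");
        }
        this.apiKey = apiKey;
        this.cookiesPath = cookiesPath;
        this.stopOnError = stopOnError;
    }

    TranscriptRequest(String apiKey, String cookiesPath) {
        this(apiKey, cookiesPath, true);
    }

    TranscriptRequest(String apiKey) {
        this(apiKey, null, true);
    }

    TranscriptRequest(String apiKey, boolean stopOnError) {
        this(apiKey, null, stopOnError);
    }

    String getApiKey() {
        return apiKey;
    }

    String getCookiesPath() {
        return cookiesPath;
    }

    boolean isStopOnError() {
        return stopOnError;
    }
}
```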
