docs/06-features-and-components/04-The-Interactive-Librarian/01-README.md
+21 -21
@@ -6,7 +6,7 @@ The Interactive Librarian was introduced in July 2021 as part of the migration o
 
 ### Context: the Interactives DCR migration
 
-When DCR was acquiring the ability to render interactives, it was decided that the initial implementation would work as follows:
+When DCR was acquiring the ability to render interactives, it was decided that the initial implementation would work as follows:
 
 1. Any interactive published before a given date, the switch date, would be rendered by frontend.
 2. Interactives published after that date would be sent to DCR for rendering (unless wearing an opt-out tag).
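
The routing rule in the two list items above can be sketched as follows. This is a minimal Python illustration, not frontend's actual code. The function name, the example switch date, the treatment of a piece published exactly on the switch date, and the idea that an opted-out interactive falls back to frontend are all assumptions; the tag `tracking/dcroptout` is the one mentioned in 02-interactive-migration.md and is only assumed to be the opt-out tag meant here.

```python
from datetime import date
from typing import FrozenSet

# Hypothetical value: the real switch date is not stated in this document.
SWITCH_DATE = date(2021, 1, 1)
# Assumed to be the opt-out tag referred to in item 2.
OPT_OUT_TAG = "tracking/dcroptout"


def choose_renderer(published: date, tags: FrozenSet[str] = frozenset()) -> str:
    """Return "frontend" or "dcr" for an interactive, following items 1 and 2."""
    if published < SWITCH_DATE:
        # Item 1: anything published before the switch date stays on frontend.
        return "frontend"
    if OPT_OUT_TAG in tags:
        # Item 2, exception: an opt-out tag keeps the piece away from DCR (assumed fallback).
        return "frontend"
    # Item 2: everything else is sent to DCR.
    return "dcr"
```

For example, `choose_renderer(date(2021, 6, 1))` selects DCR under these assumptions, while the same date with the opt-out tag selects frontend.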
@@ -23,33 +23,33 @@ The population of aws S3 with interactive content was performed on all past inte
 
 In fact, at the time these lines are written, the "ideal" for us is DCR being able to render all past interactives eventually, in which case the Librarian will be removed from frontend since it would then not be used.
 
-In order to perform the migration, Pascal introduced two [admin] routes
+In order to perform the migration, Pascal introduced two [admin] routes
 
 ```
 # Interactive Pressing
-POST /interactive-librarian/live-presser/*path
-POST /interactive-librarian/read-clean-write/*path
+POST /interactive-librarian/live-presser/*path
+POST /interactive-librarian/read-clean-write/*path
 ```
 
 The reason why those were added to the [admin] app instead of, say, [applications] (where Pascal would have found it more natural to add them) is that the [admin] app is the only app that has write access to S3.
 
 ### live-presser
 
-The first route
+The first route
 
 ```
-POST /interactive-librarian/live-presser/*path
-```
-
+POST /interactive-librarian/live-presser/*path
+```
+
 calls `InteractiveLibrarian.pressLiveContents`, triggers the retrieval of a live document and stores it to S3 in the **aws-frontend-archives-original** bucket
 curl -X POST "https://frontend.gutools.co.uk/interactive-librarian/live-presser/books/ng-interactive/2021/mar/05/this-months-best-paperbacks-michelle-obama-jan-morris-and-more"
 ```
 
 ### read-clean-write
 
-The second route
+The second route
 
-```
-POST /interactive-librarian/read-clean-write/*path
+```
+POST /interactive-librarian/read-clean-write/*path
 ```
 
 performs the read of a previously stored document, its "cleaning" and stores the outcome to bucket **aws-frontend-archive**.
 
 ### Notes
 
-A. In order for the two [admin] routes
+A. In order for the two [admin] routes
 
 ```
 # Interactive Pressing
-POST /interactive-librarian/live-presser/*path
-POST /interactive-librarian/read-clean-write/*path
+POST /interactive-librarian/live-presser/*path
+POST /interactive-librarian/read-clean-write/*path
 ```
 
-to work, the **interactive-librarian-admin-routes** switch must be ON. Because calling those routes is not part of "normal operations" for frontend, the switch should be OFF, unless otherwise specified. It never expires.
+to work, the **content-presser** switch must be ON. Because calling those routes is not part of "normal operations" for frontend, the switch should be OFF, unless otherwise specified. It never expires.
 
 B. Since a route was introduced for capturing the live content and another one was introduced for the "cleaning" (meaning reading from **aws-frontend-archives-original**, cleaning and writing to **aws-frontend-archive**), the reader could wonder where the code that actually performed those batch operations is. Answer: it is not part of the scala code. Pascal wrote a Ruby script to perform them. For the first run (batch storing to **aws-frontend-archives-original**, see documentation folder **02-Batch-01**)
 
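
The data flow of the two routes described in this hunk can be pictured with a small in-memory model. This is a hedged Python sketch, not the Scala implementation: the `Librarian` class, the `SwitchOff` exception, the dictionary "buckets" and the placeholder cleaning step are all hypothetical, and raising an error when the switch is OFF is an assumption about how the gating behaves. Only the bucket names, the switch and the `cleanOriginalDocument` name come from the documentation.

```python
from typing import Callable, Dict

ORIGINAL_BUCKET = "aws-frontend-archives-original"
CLEAN_BUCKET = "aws-frontend-archive"


class SwitchOff(Exception):
    """Raised (by assumption) when a route is called while the switch is OFF."""


def clean_original_document(html: str) -> str:
    # Placeholder standing in for cleanOriginalDocument; the real cleaning
    # rules are not described in this document.
    return html.strip()


class Librarian:
    def __init__(self, fetch_live: Callable[[str], str], switch_on: bool = False):
        self.fetch_live = fetch_live  # hypothetical: returns the live HTML for a path
        self.switch_on = switch_on    # models the content-presser switch
        self.buckets: Dict[str, Dict[str, str]] = {ORIGINAL_BUCKET: {}, CLEAN_BUCKET: {}}

    def _require_switch(self) -> None:
        if not self.switch_on:
            raise SwitchOff("content-presser switch is OFF")

    def live_presser(self, path: str) -> None:
        # Route 1: capture the live document and store it untouched.
        self._require_switch()
        self.buckets[ORIGINAL_BUCKET][path] = self.fetch_live(path)

    def read_clean_write(self, path: str) -> None:
        # Route 2: read the stored original, clean it, write the result.
        self._require_switch()
        original = self.buckets[ORIGINAL_BUCKET][path]
        self.buckets[CLEAN_BUCKET][path] = clean_original_document(original)
```

Keeping the untouched original in its own bucket is what makes it possible to rerun the cleaning later without capturing the live page again.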
@@ -94,7 +94,7 @@ D. Note that the scripts given **02-Batch-01** and **03-Batch-02**, are not port
 
 E. One question that was left out from the above discussion is "How did you find the Interactive URLs ?", alternatively "How did you build [03.interactive-urls.txt](./02-Batch-01/03.interactive-urls.txt) ?" That list is the outcome of calling CAPI. This is done by the script **01-interactive-urls**.
-This highlights the following: The two routes in [admin] that perform Interactive Pressing, are unlikely to really change in the future, but this doesn't mean that the Librarian itself is finished. In fact the missing part of the librarian is really the cleaning function, which could not have been written when the Librarian was born because we didn't know at the time which cleaning would be required.
+This highlights the following: The two routes in [admin] that perform Interactive Pressing, are unlikely to really change in the future, but this doesn't mean that the Librarian itself is finished. In fact the missing part of the librarian is really the cleaning function, which could not have been written when the Librarian was born because we didn't know at the time which cleaning would be required.
 
 That function doesn't need to be written from scratch. There are cleaners dating back from R2 and R2 Pressing that can be reused or adapted. See code in [admin] / app / pagepresser
 
@@ -130,7 +130,7 @@ The future of the Librarian, at the very moment the first version of this docume
 
 - Ideally, one day, we will find a way for DCR to render past interactives and get rid of the Librarian. (This might not happen actually...)
 
-- The Interactive Pressing routes in [admin] will probably stay, the **interactive-librarian-admin-routes** switch should remain off in normal circumstances and be activated by dotcom engineers when calling the routes for good reasons (for instance a new batch cleaning by calling the `read-clean-write` route).
+- The Interactive Pressing routes in [admin] will probably stay, the **content-presser** switch should remain off in normal circumstances and be activated by dotcom engineers when calling the routes for good reasons (for instance a new batch cleaning by calling the `read-clean-write` route).
 
 - The team will figure out which kind of cleaning those past contents needs, update the `cleanOriginalDocument` function accordingly, and rerun the `read-clean-write` operation, for all past interactives. Note that an engineer will have to either adapt Pascal's scripts or write new ones in the programming language of their choice.
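
A new batch script of the kind mentioned in the last bullet might take roughly the following shape. This is a sketch in Python, not a port of Pascal's Ruby scripts: `route_url`, `press_batch` and the caller-supplied `post` function (which is assumed to send an authenticated POST and return the HTTP status code) are hypothetical names, and treating any 2xx status as success is an assumption. The host is the one used in the `curl` example in this README.

```python
from typing import Callable, Iterable, List, Sequence, Tuple
from urllib.parse import urlsplit

ADMIN_HOST = "https://frontend.gutools.co.uk"  # host taken from the curl example
ROUTES = ("live-presser", "read-clean-write")


def route_url(route: str, interactive: str) -> str:
    """Build the admin route URL for a full interactive URL or a bare path."""
    path = urlsplit(interactive).path.lstrip("/")
    return f"{ADMIN_HOST}/interactive-librarian/{route}/{path}"


def press_batch(
    interactives: Iterable[str],
    post: Callable[[str], int],
    routes: Sequence[str] = ROUTES,
) -> Tuple[List[str], List[str]]:
    """Call each route in order for every interactive; stop at the first failure.

    Returns (succeeded, failed) lists of the input entries.
    """
    succeeded: List[str] = []
    failed: List[str] = []
    for interactive in interactives:
        # all() short-circuits, so read-clean-write is skipped if live-presser fails.
        ok = all(200 <= post(route_url(r, interactive)) < 300 for r in routes)
        (succeeded if ok else failed).append(interactive)
    return succeeded, failed
```

To rerun only the cleaning after `cleanOriginalDocument` has changed, the same function could be called with `routes=("read-clean-write",)`. The switch would need to be ON for the duration of the run.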
docs/06-features-and-components/04-The-Interactive-Librarian/02-interactive-migration.md
+5 -3
@@ -23,7 +23,7 @@ Below describes the migration process:
 - For articles greater than 5 years old, we will press everything by default.
 - For articles less than 5 years old, The Visuals team will provide a list of interactives that will not be pressed, everything else will be pressed.
 
-If issues are found with a migrated interactive then we have the the option to fix in one of three different ways:
+If issues are found with a migrated interactive then we have the option to fix in one of three different ways:
 1. Pressing the piece
 2. Visuals team fixing the piece
 3. Dotcom adding support to the platform
@@ -53,7 +53,7 @@ For pressing a batch of interactives this is controlled using command line scrip
 2. For every URL make a request to /interactive-librarian/live-presser/{path}
 3. For every successfully pressed article make a request to /interactive-librarian/read-clean-write/{path}
 
-To press a single interactive we can use the frontend admin tool. On the admin tool, select the new option ‘Press an interactive’, enter the full URL for the interactive, click ‘Press’ and wait for the response. If there’s an error in the response it’ll need to be reported to the dotcom team.
+To press a single interactive we can use the frontend admin tool. On the admin tool, select the new option ‘Press an article / interactive’, enter the full URL, click ‘Press’ and wait for the response. If there’s an error in the response it’ll need to be reported to the dotcom team.
 
 **How do we know a page is pressed?**
 
@@ -68,7 +68,7 @@ In the long term we’d like to mark pressed articles with a tracking tag (track
 **How can we view a pressed page?**
 To view a pressed page there are a couple of options:
 - Get the document from S3 directly (aws-frontend-archive).
-- Intermediate solution: add the interactive path to the frontend config (https://github.com/guardian/frontend/blob/dlawes/serve-pressed-interactives/common/app/services/dotcomrendering/PressedInteractives.scala#L11).
+- Intermediate solution: add the interactive path to the frontend config (https://github.com/guardian/frontend/blob/main/common/app/model/pressedContent.scala).
 - Long-term solution: add tag tracking/dcroptout to article.
 
 **Can we opt-out of pressing and render via frontend or DCR?**
@@ -103,3 +103,5 @@ A potential solution for reverting a migrated interactive to its pre-DCR form:
 - At this point, we also press every interactive (but not serve all this content to readers). Pressing the content would mean we save how the interactive renders via the existing platform.
 - If an article is migrated to DCR but we're unhappy with how it appears, we could fall back to serving the pressed version.
 
+## Articles Pressing
+Migration of 100% articles to DCR has been completed. In order to press articles, the same mechanism as interactives was used.