Author Topic: Coffee Shop forum Server load - ongoing issues, being resolved by grahame  (Read 8210 times)
grahame
Administrator
Hero Member
*****
Posts: 43000



View Profile WWW Email
« on: May 05, 2024, 19:43:34 »

There appears to be a very heavy server load at present - I am investigating and think I know why - sorry about any sluggishness or broken connections in the next hour or so.
Logged

Coffee Shop Admin, Chair of Melksham Rail User Group, TravelWatch SouthWest Board Member
bobm
Administrator
Hero Member
*****
Posts: 10162



View Profile
« Reply #1 on: May 05, 2024, 19:58:32 »

One way to spend a Sunday evening.

The hidden side of running a forum. 
Logged
grahame
Administrator
Hero Member
*****
Posts: 43000



View Profile WWW Email
« Reply #2 on: May 05, 2024, 20:45:08 »

Quote from: grahame on May 05, 2024, 19:43:34
There appears to be a very heavy server load at present - I am investigating and think I know why - sorry about any sluggishness or broken connections in the next hour or so.

It would seem Claude is taking an interest in our public content ... 32,000 requests so far since this morning

Quote
After working for the past few months with key partners like Notion, Quora, and DuckDuckGo in a closed alpha, we’ve been able to carefully test out our systems in the wild, and are ready to offer Claude more broadly so it can power crucial, cutting-edge use cases at scale.

Claude is a next-generation AI assistant based on Anthropic’s research into training helpful, honest, and harmless AI systems. Accessible through chat interface and API in our developer console, Claude is capable of a wide variety of conversational and text processing tasks while maintaining a high degree of reliability and predictability.

Claude can help with use cases including summarization, search, creative and collaborative writing, Q&A, coding, and more. Early customers report that Claude is much less likely to produce harmful outputs, easier to converse with, and more steerable - so you can get your desired output with less effort. Claude can also take direction on personality, tone, and behavior.

https://www.anthropic.com/news/introducing-claude

Claude is not alone - there are lots of other crawlers around too.  I put some advice to them in a "robots.txt" file but have to know about them first.   These things are a double-edged sword - in some ways they are helping themselves uninvited to our data resource, but then they help our visibility.  Sometimes you ask Alexa and it will tell you an answer from the Coffee Shop.
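
For readers wondering what that "robots.txt" advice looks like, here is a minimal sketch using Python's standard urllib.robotparser. The crawler name "ExampleBot", the "/search/" path and the 10 second delay are all made up for illustration; they are not the Coffee Shop's actual rules. Note that the file is only advice: a crawler has to choose to read and obey it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. The crawler name, path and delay
# are illustrative assumptions, not the forum's real configuration.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Crawl-delay: 10
Disallow: /search/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text the way a well-behaved crawler would."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A named crawler that has been asked to stay away entirely.
print(parser.can_fetch("ExampleBot", "/index.html"))

# Any other crawler: most pages allowed, the search area excluded.
print(parser.can_fetch("OtherBot", "/index.html"))
print(parser.can_fetch("OtherBot", "/search/results.html"))

# The requested pause between requests, in seconds.
print(parser.crawl_delay("OtherBot"))
```

This also shows why the crawler has to be known first: a rule can only name a user agent that has already been spotted in the logs, and everything else falls through to the catch-all "*" entry.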



I think the load is under control - I will watch overnight though.
Logged

Coffee Shop Admin, Chair of Melksham Rail User Group, TravelWatch SouthWest Board Member
grahame
Administrator
Hero Member
*****
Posts: 43000



View Profile WWW Email
« Reply #3 on: May 06, 2024, 06:59:22 »

Quote from: grahame on May 05, 2024, 19:43:34
There appears to be a very heavy server load at present - I am investigating and think I know why - sorry about any sluggishness or broken connections in the next hour or so.

It remained busy overnight with high crawler / bot traffic; I have tuned a few settings, but that is partly a holding operation, and some of the tunings will take up to 24 hours to click into place.  I will take a further look in due course.
Logged

Coffee Shop Admin, Chair of Melksham Rail User Group, TravelWatch SouthWest Board Member
Ralph Ayres
Transport Scholar
Hero Member
******
Posts: 399


View Profile
« Reply #4 on: May 06, 2024, 10:55:14 »

I've read all this and almost understood a couple of sentences!  Thanks to Grahame for keeping things on track for the rest of us to benefit.
Logged
johnneyw
Transport Scholar
Hero Member
******
Posts: 2455


From station to station, back to Bristol city....


View Profile
« Reply #5 on: May 06, 2024, 19:11:53 »

Quote from: Ralph Ayres on May 06, 2024, 10:55:14
I've read all this and almost understood a couple of sentences!

One more than me! 
Logged
bobm
Administrator
Hero Member
*****
Posts: 10162



View Profile
« Reply #6 on: May 06, 2024, 19:55:39 »

Basically someone repeatedly knocking on the door, asking a question and without waiting for the answer asking another and another. 
Logged
grahame
Administrator
Hero Member
*****
Posts: 43000



View Profile WWW Email
« Reply #7 on: May 06, 2024, 20:20:54 »

Quote from: bobm on May 06, 2024, 19:55:39
Basically someone repeatedly knocking on the door, asking a question and without waiting for the answer asking another and another.

Indeed - multiple people doing it, mind - and where there are multiple links in a page they will tend to go on and ask for lots of these follow-ups in parallel.   "Nice" someones are a bit considerate ... others get themselves labelled as "naught_boys" and that tends to lead to restrictions or bans.
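
One common way to tell the considerate callers from the others is to count each client's requests over a short sliding window. The sketch below is an assumption about how such a check could look, not the forum's actual code; the class name, the threshold of 5 requests and the 1 second window are all hypothetical.

```python
from collections import defaultdict, deque


class RequestWatcher:
    """Track recent request times per client and flag heavy hitters."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._recent = defaultdict(deque)  # client id -> request timestamps
        self.naughty = set()               # clients that exceeded the limit

    def record(self, client: str, now: float) -> bool:
        """Record one request; return True if it should still be served."""
        times = self._recent[client]
        times.append(now)
        # Forget requests that have fallen out of the sliding window.
        while times and now - times[0] > self.window_seconds:
            times.popleft()
        if len(times) > self.max_requests:
            self.naughty.add(client)
        return client not in self.naughty


watcher = RequestWatcher(max_requests=5, window_seconds=1.0)

# A considerate client: one request per second.
polite = [watcher.record("polite", float(t)) for t in range(10)]

# A greedy client: twenty requests within a fifth of a second.
greedy = [watcher.record("greedy", i * 0.01) for i in range(20)]

print(all(polite), greedy.count(True), sorted(watcher.naughty))
```

In this sketch a flagged client stays flagged; a real setup would more likely expire the label after a while, or slow the client down rather than refuse it outright.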
Logged

Coffee Shop Admin, Chair of Melksham Rail User Group, TravelWatch SouthWest Board Member
grahame
Administrator
Hero Member
*****
Posts: 43000



View Profile WWW Email
« Reply #8 on: May 21, 2024, 12:28:13 »

Quote from: grahame on May 06, 2024, 20:20:54
Quote from: bobm on May 06, 2024, 19:55:39
Basically someone repeatedly knocking on the door, asking a question and without waiting for the answer asking another and another.

Indeed - multiple people doing it, mind - and where there are multiple links in a page they will tend to go on and ask for lots of these follow-ups in parallel.   "Nice" someones are a bit considerate ... others get themselves labelled as "naught_boys" and that tends to lead to restrictions or bans.

Still thousands of requests per hour - our server received 314,169 requests in the 24 hours to 03:30 this morning, which is an average of over 200 per minute.   I don't actually want to stop answering too many of these queries, but some of the answers need a lot of compute resource and we (or rather I) may be able to reduce that somewhat, especially for the "someone"s who may not need a full, current answer.
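
The arithmetic behind that average, as a quick check. Only the 314,169 figure comes from the post; the rest is plain division.

```python
requests_in_day = 314_169        # figure quoted in the post
minutes_in_day = 24 * 60         # 1,440

per_minute = requests_in_day / minutes_in_day
per_second = per_minute / 60

print(f"{per_minute:.0f} per minute, {per_second:.1f} per second")
```

That works out at roughly 218 requests a minute, or between three and four every second, sustained around the clock.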

First steps to sorting the issues are to analyse them, and for that purpose I am putting in a restriction on certain accesses to see what difference it makes - I'll run it for a few hours, then perhaps move the restriction. If you should happen to get unexpected error pages that persist for more than a few minutes, please let me know.
Logged

Coffee Shop Admin, Chair of Melksham Rail User Group, TravelWatch SouthWest Board Member
TonyK
Global Moderator
Hero Member
*****
Posts: 6592


The artist formerly known as Four Track, Now!


View Profile
« Reply #9 on: May 21, 2024, 20:17:13 »

I came late to this party. I read and understood "There appears to be a very heavy server load at present", and got lost after that. I assume all is well again, and well done Graham!
Logged

Now, please!
grahame
Administrator
Hero Member
*****
Posts: 43000



View Profile WWW Email
« Reply #10 on: May 22, 2024, 07:21:44 »

Quote from: TonyK on May 21, 2024, 20:17:13
I came late to this party. I read and understood "There appears to be a very heavy server load at present", and got lost after that. I assume all is well again, and well done Graham!

Thanks, TonyK and all the folks who have "like"d the thread.     

Monitoring / watching is ongoing with a flow of incoming requests and it's very much a question of being prepared to handle whatever hits us - and we can guess what that might be but are sometimes taken by surprise.

In many ways, there's a parallel with passengers arriving at a railway station.  Consider the Coffee Shop to be like Paddington for departing passengers.   

For the most part on the passenger side, it's pretty predictable how and when people will turn up and there are services they can board.  There may be occasional surges - for example from a news story / event, or when someone tells their friends what a marvellous service is offered.   Additional incoming infrastructure - the opening of the Elizabeth line - changes the flow of incoming passengers in ways in which the best we can do is try to predict the effect.

What sort of people will turn up?   They might be the nice, easy-to-handle passengers ... or a disproportionate number might have bicycles or heavy luggage, be in wheelchairs, or need the loo, all of which will put pressure on the station and service.

On the service side we have a number of platforms from which we can run services, a number of trains of limited capacity, and a number of tracks out on the line; each of these (disc space, memory and threads, and CPU resource) is finite.

Platforms, trains and tracks need some maintenance - not as much as the rail network, but things like a build-up of rubbish need clearing out from time to time, and we need to take copies of the setup so that if something goes wrong we have the tools to repair it.   And they also need monitoring and understanding so we are not taken by surprise when something does happen - either a fault or an overload, perhaps exacerbated by an out-of-pattern arrival of a lot of customers of a particular kind.  Lots of people with backpacks headed for Castle Cary ...

Where I flagged up a warning yesterday, it was because I was putting in various updates to the flow of how we get people onto trains and away, and warning customers that there was a chance I might accidentally turn some away.   We are still running within resources, but I was doing this as a proactive action to understand where the resources were / are going.

I have just started writing this and the parallels are surprisingly good - but there are differences.  If encouraged, I may continue to use the parallel for other explanations.

Logged

Coffee Shop Admin, Chair of Melksham Rail User Group, TravelWatch SouthWest Board Member
Red Squirrel
Administrator
Hero Member
*****
Posts: 5447


There are some who call me... Tim


View Profile
« Reply #11 on: May 22, 2024, 09:37:38 »

This reminds me of Humphrey Lyttelton explaining ‘one song to the tune of another’ on I'm Sorry I Haven’t a Clue: So imagine that the lyrics are the passengers, and the tunes are the trains. Now what happens if a train is accidentally shunted into the wrong siding..?
Logged

Things take longer to happen than you think they will, and then they happen faster than you thought they could.
grahame
Administrator
Hero Member
*****
Posts: 43000



View Profile WWW Email
« Reply #12 on: May 22, 2024, 09:56:03 »

Quote from: Red Squirrel on May 22, 2024, 09:37:38
This reminds me of Humphrey Lyttelton explaining ‘one song to the tune of another’ on I'm Sorry I Haven’t a Clue: So imagine that the lyrics are the passengers, and the tunes are the trains. Now what happens if a train is accidentally shunted into the wrong siding..?

That could be either a frustrating failure to deliver, or a security breach!
Logged

Coffee Shop Admin, Chair of Melksham Rail User Group, TravelWatch SouthWest Board Member
TonyK
Global Moderator
Hero Member
*****
Posts: 6592


The artist formerly known as Four Track, Now!


View Profile
« Reply #13 on: May 22, 2024, 11:43:26 »

I wonder - the big surge happened over a bank holiday weekend, and there's another coming up soon. The thought occurred to me after hearing a radio programme about the surge in the number of companies being registered at Companies House, using random addresses of unsuspecting British folk. The chap investigating this phenomenon found it was likely because of a recent ban, by their own government, on Chinese companies dealing in crypto currency. The traffic suddenly dropped for a few days, which he found were a public holiday in China. Could it be that the opposite works, and attempts at getting into networks rise when IT support staff are likelier to be fewer in number? Except at the Coffee Shop, of course.
Logged

Now, please!
grahame
Administrator
Hero Member
*****
Posts: 43000



View Profile WWW Email
« Reply #14 on: May 24, 2024, 06:41:43 »

Quote from: TonyK on May 22, 2024, 11:43:26
I wonder - the big surge happened over a bank holiday weekend, and there's another coming up soon. The thought occurred to me after hearing a radio programme about the surge in the number of companies being registered at Companies House, using random addresses of unsuspecting British folk. The chap investigating this phenomenon found it was likely because of a recent ban, by their own government, on Chinese companies dealing in crypto currency. The traffic suddenly dropped for a few days, which he found were a public holiday in China. Could it be that the opposite works, and attempts at getting into networks rise when IT support staff are likelier to be fewer in number? Except at the Coffee Shop, of course.

There used to be an hour-by-hour pattern, with the server consistently three times as busy during the day as at night, and with weekend peaks about half of those during the week. That was accounted for by it being primarily a server of specialist information for IT professionals.    It's a different server these days, of course - we're in the cloud and with a lot more compute power to handle requests, but the nature of those requests, and indeed what we serve, has changed.

If you see an address ending in ".html" it is highly unlikely it will actually be a file on a disc or in a memory somewhere that contains hypertext markup language - rather it will be a program that fills in a template with data from a database, sometimes with significant compute involved and with those databases having a long archive in them which influences the output.    The same applies to most of the ".jpg" images we serve.
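
A minimal sketch of that idea, using an in-memory SQLite database and a string template. The forum itself runs on PHP and MySQL, so this is only an illustration of the principle; the table layout, the sample rows and the function names are all hypothetical.

```python
import html
import sqlite3
from string import Template

# The "page" is just a template; its content comes from the database.
PAGE = Template("<html><body><h1>$title</h1><ul>$items</ul></body></html>")


def make_db() -> sqlite3.Connection:
    """Build a tiny sample database (hypothetical schema and rows)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE posts (topic TEXT, author TEXT, body TEXT)")
    db.executemany(
        "INSERT INTO posts VALUES (?, ?, ?)",
        [
            ("server-load", "alice", "Heavy load tonight"),
            ("server-load", "bob", "One way to spend a Sunday"),
            ("timetables", "carol", "Unrelated post"),
        ],
    )
    return db


def render_topic(db: sqlite3.Connection, url_path: str) -> str:
    """Turn a request for '/<topic>.html' into freshly generated HTML."""
    topic = url_path.strip("/").removesuffix(".html")
    rows = db.execute(
        "SELECT author, body FROM posts WHERE topic = ? ORDER BY rowid",
        (topic,),
    ).fetchall()
    items = "".join(
        f"<li>{html.escape(author)}: {html.escape(body)}</li>"
        for author, body in rows
    )
    return PAGE.substitute(title=html.escape(topic), items=items)


page = render_topic(make_db(), "/server-load.html")
print(page)
```

No file called "server-load.html" exists anywhere in this sketch. Every request runs a query and builds the page afresh, which is why each extra crawler visit costs real compute.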

Crawlers, bots and spiders - automata which for the most part are the same thing as each other - typically grab a ".html" page, look through it for hypertext links to other .html pages, go off and grab those, and so on until they have grabbed the whole public site.   We want them to do so, for the most part - at least the more useful content. Yes, I want people to be offered Coffee Shop content when they search on Google, or ask Alexa ... I probably want the plagiarism searches that universities use to reveal that a student has copied from what a member wrote in 2016.  And I would like chatbots to be informed by our content too.
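
The grab-and-follow loop described above can be sketched in a few lines. To keep it self-contained, the "site" here is a hypothetical dictionary of four pages rather than anything fetched over the network, and the function and class names are made up.

```python
from collections import deque
from html.parser import HTMLParser


class LinkFinder(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# A made-up four page site standing in for a real web server.
SITE = {
    "/index.html": '<a href="/a.html">A</a> <a href="/b.html">B</a>',
    "/a.html": '<a href="/b.html">B</a> <a href="/c.html">C</a>',
    "/b.html": '<a href="/index.html">home</a>',
    "/c.html": "no links here",
}


def crawl(start, fetch):
    """Breadth-first crawl: fetch a page, queue its unseen links, repeat."""
    seen = {start}
    queue = deque([start])
    fetched = []
    while queue:
        url = queue.popleft()
        page = fetch(url)
        if page is None:       # broken link: nothing to parse
            continue
        fetched.append(url)
        finder = LinkFinder()
        finder.feed(page)
        for link in finder.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return fetched


order = crawl("/index.html", SITE.get)
print(order)
```

Each page is fetched once, but every one of those fetches is a request the server has to answer. A crawler that skips the "seen" check, or runs many of these loops in parallel, multiplies the load accordingly.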

But

* Compute power and storage is much cheaper these days, and so there are lots of spiders crawling around - almost an infestation

* The number of different pages (URLs) that we have ever grows with our long historic record

* The cost of servicing each individual request grows as each individual request needs to consider that individual record in the database.

* Extra facilities added

Combined, those four multiply together rather than simply adding up, and the increases in data volume are phenomenal.  The image database, which was set up because the original folder of perhaps 500 pictures was getting a bit big, now serves some 20,000 using several gigabytes of storage.   Our little forum has over a quarter of a million messages on it to be searched through.   The new passenger flow system that I put in the other week has some 4 million records to analyse and sort, and there are some 10,000 different pages for the spiders to find, each of which "has to" go through those records and sort the results.
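
To give a feel for how those factors multiply, here is the passenger flow example as arithmetic. The page and record counts are the round figures from the post; the assumption that every page request visits every record is a deliberate worst case for illustration.

```python
pages = 10_000          # "some 10,000 different pages"
records = 4_000_000     # "some 4 million records"

# Worst case assumed here: each page request goes through every record.
record_visits = pages * records
print(f"{record_visits:,} record visits for one full crawl")

# If both the page count and the archive double, the work quadruples.
doubled = (2 * pages) * (2 * records)
print(f"{doubled // record_visits}x the work after doubling both")
```

Under that assumption a single spider working through the whole set causes 40 billion record visits, and that is before counting how many spiders there are.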

Server load (just one of a number of monitors) from last night - ideally it should be at or below 1 job queuing at any time

The hard black line is yesterday ... the coloured lines are previous days
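
A small sketch of reading that kind of figure, assuming the graph plots something like the Unix load average (roughly, the number of jobs running or waiting to run). Only the target of 1 comes from the caption above; the "busy" threshold of 3 and the function names are hypothetical.

```python
import os


def describe_load(load: float, target: float = 1.0) -> str:
    """Classify a load figure against the 'at or below 1 job' target."""
    if load <= target:
        return "ok"
    if load <= 3 * target:    # hypothetical threshold for illustration
        return "busy"
    return "overloaded"


def current_load():
    """Return the 1-minute load average, or None where unavailable."""
    try:
        return os.getloadavg()[0]
    except (AttributeError, OSError):
        return None


load = current_load()
if load is not None:
    print(f"1-minute load {load:.2f}: {describe_load(load)}")
```

On a machine with several CPU cores a load above 1 is not necessarily a problem, so a real monitor would usually compare the figure with the number of cores.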

How do we control the load? Methods include:
1. We tell benign crawlers to avoid some areas on the server
2. We identify some crawlers and send them cached (slightly old) data rather than regenerating every time
3. We identify some crawlers and return a "go away"
4. We add extra code to eliminate irrelevant data early in searches
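
Methods 2 and 3 in the list above can be sketched as a small policy object that sits in front of the page generator. This is an illustration under assumptions, not the forum's actual code: the crawler names "badbot" and "slowbot", the one hour cache lifetime and the class itself are all hypothetical. Method 1 is the robots.txt file, and method 4 is specific to each search so it is not shown.

```python
import time


class CrawlerPolicy:
    """Decide per request: refuse, serve from cache, or generate afresh."""

    def __init__(self, blocked, cached, ttl_seconds=3600.0):
        self.blocked = tuple(name.lower() for name in blocked)
        self.cached = tuple(name.lower() for name in cached)
        self.ttl_seconds = ttl_seconds
        self._cache = {}   # path -> (time generated, body)

    def handle(self, path, user_agent, generate, now=None):
        """Return (HTTP status, body) for one request."""
        now = time.time() if now is None else now
        agent = user_agent.lower()

        # Method 3: identified crawlers that simply get a "go away".
        if any(name in agent for name in self.blocked):
            return 403, "go away"

        # Method 2: identified crawlers that get slightly old data.
        if any(name in agent for name in self.cached):
            hit = self._cache.get(path)
            if hit is not None and now - hit[0] < self.ttl_seconds:
                return 200, hit[1]
            body = generate(path)
            self._cache[path] = (now, body)
            return 200, body

        # Everyone else gets a full, current answer.
        return 200, generate(path)


calls = []


def expensive_page(path):
    """Stand-in for a costly database-backed page build."""
    calls.append(path)
    return f"page for {path} (build {len(calls)})"


policy = CrawlerPolicy(blocked=["badbot"], cached=["slowbot"])
print(policy.handle("/p.html", "BadBot/1.0", expensive_page, now=0.0))
print(policy.handle("/p.html", "SlowBot/2.0", expensive_page, now=0.0))
print(policy.handle("/p.html", "SlowBot/2.0", expensive_page, now=10.0))
print(policy.handle("/p.html", "Mozilla/5.0", expensive_page, now=20.0))
```

The second SlowBot request is answered from the cache without rebuilding the page, which is where the saving comes from. The price is that the crawler may see data up to an hour old.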
Logged

Coffee Shop Admin, Chair of Melksham Rail User Group, TravelWatch SouthWest Board Member