Great Western Coffee Shop

Sideshoots - associated subjects => News, Help and Assistance => Topic started by: grahame on July 30, 2011, 07:05:59



Title: Server outage problem this morning
Post by: grahame on July 30, 2011, 07:05:59
The server on which the "Coffee Shop" is hosted may be unavailable for a period from lunchtime today.  I hope that any interruption to service will be a short one.

Update / news - please bookmark
http://www.wellho.net/share/coffeeshop.html
which is on a different domain / server (hosted in a different country, in fact) so will be available throughout.


Title: Re: Server outage problem this morning
Post by: grahame on July 31, 2011, 05:20:47
Issue resolved without any interruption of service.

It's still a good idea for regulars to bookmark the status page just in case of main server issues.


Title: Re: Server outage problem this morning
Post by: grahame on September 26, 2011, 15:59:49
Due to infrastructure issues, our server was offline for a couple of hours early this afternoon.   Sorry about the break in service - happy to be back  ;)


Title: Re: Server outage problem this morning
Post by: 6 OF 2 redundant adjunct of unimatrix 01 on September 26, 2011, 16:02:50
I'm getting no current running map; not sure if this is connected or Google Chrome having a moment.


Title: Re: Server outage problem this morning
Post by: grahame on September 26, 2011, 16:42:48
Running for me ... and from the link:

Quote
17:00 Brighton to Bristol Temple Meads due 20:29

This train will be started from Barnham. It will no longer call at: Brighton, Hove, Shoreham-By-Sea and Worthing. This is due to train crew having been unavailable earlier. Last Updated: 26/09/2011 15:53

Is it just me, or has this particular train been a frequent victim at its Eastern end of late?


Title: Re: Server outage problem this morning
Post by: grahame on September 29, 2011, 18:30:47
And again ...

Quote

17:00 Brighton to Bristol Temple Meads due 20:29

This train will be started from Barnham. It will no longer call at: Brighton, Hove, Shoreham-By-Sea and Worthing. This is due to an earlier trespass incident. Last Updated: 29/09/2011 15:42

Looking back at the old diagrams from the top of these pages, it appears that the 17:00 has failed to leave Brighton on 1st, 3rd, 5th, 15th, 24th and 30th August, and on 2nd, 16th, 26th and 29th September.  On one extra date (12th August), it did leave Brighton but didn't get beyond Salisbury.  Is all this lot just co-incidence, or is this train "top of the chops"?  I note that the 17:00 from Brighton is followed by another train at 17:03 that follows it as far as Havant.


Title: Re: Server outage problem this morning
Post by: Brucey on September 29, 2011, 20:15:42
You aren't alone in thinking this Graham.  A couple of years ago, I made the same observation with the Brighton services - ending/starting short at either end.  It happened for a little while but then didn't occur so often.

I'm not sure what the consequences would be if a Brighton service were to run late and/or get stuck somewhere outside FGW/SWT territory?


Title: Re: Server outage problem this morning
Post by: JayMac on September 29, 2011, 20:26:17
A big taxi bill getting crew to/from the unit for starters!



Title: Re: Server outage problem this morning
Post by: grahame on July 09, 2012, 17:52:21
One of our servers will be upgraded overnight sometime within the next week, and you may notice the disruption map, qr codes, many images and a few other features not being available. 


Title: Re: Server outage problem this morning
Post by: bobm on July 25, 2012, 15:52:12
Is this still going on?  It seems on odd occasions over the last few days I've not had an email about a new reply to a topic despite the "notify" option being on. It's not consistent. Sometimes I do sometimes I don't. Can't see a pattern to it.


Title: Re: Server outage problem this morning
Post by: grahame on July 25, 2012, 16:27:58
Is this still going on?  It seems on odd occasions over the last few days I've not had an email about a new reply to a topic despite the "notify" option being on. It's not consistent. Sometimes I do sometimes I don't. Can't see a pattern to it.

Nope - it was out for a few minutes on the night after the message, and that was all.  We had a database problem that I caught within a few minutes the other evening, caused by our MySQL tables filling the server's disc;  I've deleted some old error logs and got some disc space back, but with over 100,000 posts the site's rather bigger than was planned when it was started.   The other thing is that night owls may have issues between 10 past and 20 past 3 in the mornings, and there are additional backups every couple of hours during the day too, when it may be slower for a minute or so.
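
A disc filling up, as described above, is the sort of thing a small scheduled check can flag before the database stops. The following is only a minimal sketch and not the check actually used on this server: the function names and the 90% threshold are assumptions made for illustration.

```python
import shutil


def disk_fraction_used(path: str = "/") -> float:
    """Return the fraction (0.0 to 1.0) of the disc holding `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return usage.used / usage.total


def should_warn(fraction_used: float, threshold: float = 0.9) -> bool:
    """True when usage has reached the (assumed) warning threshold."""
    return fraction_used >= threshold


if __name__ == "__main__":
    fraction = disk_fraction_used("/")
    status = "WARNING: disc nearly full" if should_warn(fraction) else "ok"
    print(f"{fraction:.0%} used - {status}")
```

Run from a scheduler every few minutes, a check like this would give some warning to clear old logs before the tables run out of room.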


Title: Re: Server outage problem this morning
Post by: grahame on October 25, 2012, 05:16:43
Server outage for approx 100 minutes during the night - sorry about that to any night owls. An automated procedure that should not be visitor-visible left the system offline as far as I can see so far.  The status page at http://www.wellho.net/share/status.html (on a different server to the coffee shop!) is used to log problems ... feel free to bookmark it.


Title: Re: Server outage problem this morning
Post by: bobm on October 25, 2012, 08:42:14
It demonstrated a weakness in the new iOS 6 software on iPhones/iPads.  When trying to look at the site during the outage it said you couldn't because the device was not connected to the internet!


Title: Re: Server outage problem this morning
Post by: JayMac on October 25, 2012, 10:52:44
I was one of the night-owls, doing some historical browsing of forum threads. Slightly nonplussed when I started getting 'unable to connect' returns. I always think initially that the problem is my end.

Glad to hear it was nothing major.


Title: Re: Server outage problem this morning
Post by: LiskeardRich on October 25, 2012, 17:01:42
I was also a night owl. Woke at around 0430, couldn't get back to sleep so was going to browse some old threads that may be interesting from before my time began on here.
Absolutely typical waking that early on my day off, yet when I need to wake at around 6am I really struggle getting up!


Title: Re: Server outage problem this morning
Post by: grahame on October 25, 2012, 17:53:12
I was also a night owl. Woke at around 0430, couldn't get back to sleep so was going to browse some old threads that may be interesting from before my time began on here.  ...

I know ... and you were on very quickly once we were back up.   I've just checked the access pattern out of interest ... Google Analytics reports up to 80 different visitors per hour during the day, and that drops during the night to just a handful - but it's hardly ever zero (certainly hasn't been for the last week!).   There's always a handful of people signing up in the middle of the night from places where it's daytime, hoping to be able to add "forumspam", but it's rare for an hour to go by without seeing any members at all.


Title: Re: Server outage problem this morning
Post by: grahame on March 30, 2013, 07:39:32
One of our servers (not the main one that hosts the Coffee Shop) suffered a catastrophic failure last night, and I'm taking the opportunity of Easter to take a little longer to restore it than I normally would - applying updates and upgrades before it goes back fully live.

On the "Coffee Shop", you'll find that the current feed of journeys is broken, and that images I've posted and my avatar will give you broken image symbols - there may be a couple of other minor effects too.  I'm anticipating that the server will progressively return to service over the next 12 hours ... will update you here


Title: Re: Server outage problem this morning
Post by: grahame on March 30, 2013, 17:34:27
... will update you here

Mostly back ... QR codes, images and the current running map which are fed by that server are now back in operation.


Title: Re: Server outage problem this morning
Post by: grahame on June 12, 2013, 08:30:14
I am getting unconfirmed reports of issues and there's a chance the server may go down.   Please note the following URL on another server for status.

http://www.wellho.net/share/status.html


Title: Re: Server outage problem this morning
Post by: grahame on June 13, 2013, 10:58:27
Our server went offline at around 12:30 yesterday (12th June) and is now back for testing at 11:00 (13th June).  We're on a different server with a new operating system and new versions of databases.


Title: Re: Server outage problem this morning
Post by: grahame on June 13, 2013, 11:01:46
Our server went offline at around 12:30 yesterday (12th June) and is now back for testing at 11:00 (13th June).  We're on a different server with a new operating system and new versions of databases.

Well ... that posted OK  ;D

We may have lost any contributions made between 11:30 and 12:30 yesterday (most recent backup to system failure) - sorry about that, chaps and chapesses, and any posts you make this afternoon are not guaranteed to remain (not that they ever are, but I'm just testing at the moment on here)

To login, you may need to get an email reminder sent and reset your password.   Your previous login cookie will certainly be invalid.  And when you come to re-login, please use the top left boxes.


I am aware of a few issues and am looking at some of them.   And I will come back and add a more complete account of what hit us yesterday.    Prime concentration is on getting up and running again and letting people know that we're still here  ;D



Title: Re: Server outage problem this morning
Post by: Andrew1939 from West Oxon on June 13, 2013, 16:33:02
Am I going blind? I have recently had news about progress on the new Hanborough Car park and there has been an ongoing heading for parking at Charlbury and Hanborough stations but I can't seem to see it since yesterday's problems.


Title: Re: Server outage problem this morning
Post by: grahame on June 13, 2013, 17:09:17
Am I going blind? I have recently had news about progress on the new Hanborough Car park and there has been an ongoing heading for parking at Charlbury and Hanborough stations but I can't seem to see it since yesterday's problems.

See http://www.firstgreatwestern.info/coffeeshop/index.php?topic=12535.msg134428#msg134428

As noted in this thread too, I am aware of these issues.  It's been quite a traumatic 36 hours; I do have the data but I am not rushing to put it up in case I manage to screw it up royally. Will probably fall asleep early this evening, do a complete backup as we stand and then load the extra stuff (including some extra parking at Charlbury) in the early hours.


Title: Re: Server outage problem this morning
Post by: Chris from Nailsea on June 13, 2013, 23:18:31
Many thanks for your obviously very strenuous efforts to get the Coffee Shop forum 'up and running again', grahame.  :o


Title: Re: Server outage problem this morning
Post by: grahame on June 14, 2013, 08:24:20
Posts / threads back (all, I think) including the popular Reading thread and the prize quiz. Next - a similar thing to sort out personal messages, within the next 24 hours. 


Title: Re: Server outage problem this morning
Post by: Red Squirrel on June 14, 2013, 08:55:54
Your hard work is very much appreciated!

If anyone reading this hasn't managed to reset their password yet, please do check your spam or junk e-mail folder - that's where I found my reset notification.



Title: Re: Server outage problem this morning
Post by: grahame on June 14, 2013, 21:14:17
Follow up and updates at http://www.firstgreatwestern.info/coffeeshop/index.php?topic=12538.0 - similar topics in two threads so I'm locking this one (and testing in the process!)


Title: Re: Server outage problem this morning
Post by: grahame on July 10, 2013, 11:20:51
Sorry about the brief outage within the last half hour.   The cause was a networking problem at the London data centre which looks after our server.


Title: Re: Server outage problem this morning
Post by: grahame on January 01, 2014, 08:09:35
My apologies about the outage before dawn this morning (if anyone was up  ;D ) - we ran out of space during a big server backup.


Title: Re: Server outage problem this morning
Post by: grahame on January 25, 2014, 09:57:16
We had some server issues early this morning - this is an interim report as testing and investigation are still under way.    The effect was a loss of service on the Coffee Shop.

I have restored the system back to how it was at 01:30 this morning - so that means that any "night owl" posts and messages will have been lost.  I hope there weren't too many, but on usual form for a Saturday morning I would be very surprised ...

Because the issue is still under investigation, I cannot rule out the possibility of further posts and messages being deleted if we have to restore again, though I would put such a need in the 'unlikely' basket.  I suggest you keep a local copy of your posts just in case ;)




Edit note: One minor typo corrected, for clarity - 'night owl', rather than 'night own'. Thanks, grahame. CfN.


Title: Re: Server outage problem this morning
Post by: grahame on January 25, 2014, 14:19:45
The service has blown a second time, and I'm investigating further - potentially a deeper problem than I had thought / hoped.  At present, I'm seeing some of the results of a problem but not necessarily the root cause and I need to continue to investigate.   Please bear in mind you may see intermittent service over coming hours.


Title: Re: Server outage problem this morning
Post by: grahame on January 25, 2014, 14:30:46
Test post ... (you may see a few of these) ...  ;D


Title: Re: Server outage problem this morning
Post by: grahame on January 25, 2014, 16:43:55
Ok - I think I have some clues as to what was happening, but I'll spare you the long technical explanation.   I'm fairly sure the fix applied has worked, but still looking / exploring as to why the fix was needed - if there's some underlying issue that could cause other "funnies".

Please carry on posting / using the forum as usual.  I have good reason to believe that posts will not be lost even if it goes belly up again, and if posts are lost it would only be a few minutes worth.


Title: Re: Server outage problem this morning
Post by: bobm on January 25, 2014, 21:54:49
As ever we give Graham a round of applause for what seems to have been a speedy resolution. (Touch wood).

Having had problems with forum software myself earlier this week I know what a minefield it can be - and an exciting challenge too - to fix it.  However at the back of your mind there is always a feeling that while you are doing your utmost to fix it there are people clicking impatiently on their mice trying to use the forum. 

However you will notice there were regular updates at the head of the page - how often do we comment on information being lacking at times of rail disruption....


Title: Re: Server outage problem this morning
Post by: Chris from Nailsea on January 26, 2014, 00:14:45
Indeed - many thanks to grahame for his sterling work in resolving these issues.


Title: Re: Server outage problem this morning
Post by: grahame on January 26, 2014, 07:00:49
As ever we give Graham a round of applause for what seems to have been a speedy resolution. (Touch wood).

Having had problems with forum software myself earlier this week I know what a minefield it can be - and an exciting challenge too - to fix it.  However at the back of your mind there is always a feeling that while you are doing your utmost to fix it there are people clicking impatiently on their mice trying to use the forum. 

However you will notice there were regular updates at the head of the page - how often do we comment on information being lacking at times of rail disruption....

I'm fairly sure the direct problem is fixed.  I do, however, remain less sure of why it occurred, and repeated. I think I know but there's an element of doubt that means I'll keep monitoring carefully for a while and leave database backups (which may affect performance noticeably) in place for a while.

Keeping the customer informed during disruption is something that I have learnt from the rail industry - often as examples of how not to do it.   First test when I noticed the forum was down was to see if it was general and at the server end or something local to my internet connection / browser, or specific to my login.   Once generally known, then I've learned the stages need to be:
a) Take action to stop any immediate problems getting worse
b) Let people know that there's an issue, and give an update time
c) Do some work to(wards) get(ting) the thing fixed
d) Update by the time already notified - even if it's just a further time for update
e) goto c)
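
The stages above amount to a simple loop, which can be sketched as follows. This is an illustration only, not anything that runs on the forum: the function name, the callback parameters, the message wording and the 30-minute update interval are all hypothetical.

```python
from typing import Callable, List


def handle_outage(
    contain: Callable[[], None],
    announce: Callable[[str], None],
    attempt_fix: Callable[[], bool],
    max_rounds: int = 10,
) -> bool:
    """Run stages a) to e); return True once attempt_fix reports success."""
    contain()                                                   # a) stop it getting worse
    announce("Known issue - next update within 30 mins")        # b) tell people, give a time
    for _ in range(max_rounds):
        if attempt_fix():                                       # c) work towards a fix
            announce("Fixed - service restored")
            return True
        announce("Still working - next update within 30 mins")  # d) update by the promised time
        # e) goto c)
    return False  # gave up after max_rounds attempts


if __name__ == "__main__":
    messages: List[str] = []
    attempts = iter([False, False, True])  # pretend the third attempt succeeds
    fixed = handle_outage(
        contain=lambda: messages.append("[immediate damage contained]"),
        announce=messages.append,
        attempt_fix=lambda: next(attempts),
    )
    print(fixed)
    print("\n".join(messages))
```

The point of the structure is that an announcement goes out on every pass, even when the only news is the time of the next update.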

Users sending me messages "do you know the forum is down" are helpful at the early stage - that may be how I learn of the problem, and they also help make me quickly aware that it's a more general issue.  Messages with clues as to what happened / odd things noticed may be real gems in telling me something that I hadn't seen / noticed that's a big clue. But once the problem's noted and described, they become heartwarming in that I know people care, but a diversion from actually fixing the problem.    Thus the status pages and reports.

It actually costs a few minutes to do the status pages - and I have the luxury of being invisible / isolated from our users while I'm doing it.  And I don't have the mandated deadlines and the serious issue of people building up to get home that the rail industry has.  Additionally, forum users are well read, bright people and well known to us as friends for the most part with whom the sharing of a degree of background technical information and probabilities works well. These are not necessarily the metrics of rail users in general, so whether the technique described and used would work in the case of rail delays, I'm not sure.  "We don't know if we're going to have to cancel the 12:02 to Westbury - we will give you an update by 11:30" seems sensible but will everyone understand, and does the time taken in getting the message out mean there's less chance of it running due to a loss of fixing time?

Anyway ... I am still keeping a watching brief after yesterday ... and the status has been updated to reflect that.


Title: Re: Server outage problem this morning
Post by: SandTEngineer on January 26, 2014, 11:44:11
...yes thanks Grahame.  I got the 'Wellhouse' message at one point during logging on and that kept me well informed.  One small observation if I may.  The use of the yellow text in the forum header warning message is very difficult to read against the light green background (well at least to my older eyes).  Any chance it could be a darker colour?


Title: Re: Server outage problem this morning
Post by: grahame on January 26, 2014, 13:10:16
...yes thanks Grahame.  I got the 'Wellhouse' message at one point during logging on and that kept me well informed.  One small observation if I may.  The use of the yellow text in the forum header warning message is very difficult to read against the light green background (well at least to my older eyes).  Any chance it could be a darker colour?

Good point on the colour.  I just dropped in a fixed colour, overlooking that we have multiple styles and it may not have contrasted well in others.  I'll bear that in mind in future; in the immediacy I overlooked it and I'll take out the colour now as it's a less "hit em hard" message today.


Title: Re: Server outage problem this morning
Post by: SandTEngineer on January 26, 2014, 16:30:18
...yes thanks Grahame.  I got the 'Wellhouse' message at one point during logging on and that kept me well informed.  One small observation if I may.  The use of the yellow text in the forum header warning message is very difficult to read against the light green background (well at least to my older eyes).  Any chance it could be a darker colour?

Good point on the colour.  I just dropped in a fixed colour, overlooking that we have multiple styles and it may not have contrasted well in others.  I'll bear that in mind in future; in the immediacy I overlooked it and I'll take out the colour now as it's a less "hit em hard" message today.

Thanks for that.  I can read it now.......... ;)


Title: Re: Server outage problem this morning
Post by: stuving on January 27, 2014, 12:39:16
Graham - I didn't realise for most of yesterday that the coffeeshop was up, because www.firstgreatwestern.info itself was down (index page only). That seems to be still true. I'd imagined that was the easier part to get going again.


Title: Re: Server outage problem this morning
Post by: grahame on January 27, 2014, 12:52:07
Graham - I didn't realise for most of yesterday that the coffeeshop was up, because www.firstgreatwestern.info itself was down (index page only). That seems to be still true. I'd imagined that was the easier part to get going again.

Ah - thanks for spotting that.  It shows why we need regression testing to spot these things, but then there is so much more we could do and resources are limited.


Title: Re: Server outage problem this morning
Post by: grahame on January 27, 2014, 12:55:21
OK ... main page pointing to the Forum now (perhaps that's a good idea for the domain anyway) ... and acronym and abbreviations back.   I changed some passwords in my work on Saturday and overlooked that front system ...


Title: Re: Server outage problem this morning
Post by: anthony215 on January 27, 2014, 15:24:20
Whatever you guys did it seems to have made the various pages on the forum load a lot quicker than they used to.

Anyway thanks Grahame for the hard work put in to fix the fault


Title: Re: Server outage problem this morning
Post by: grahame on February 16, 2014, 18:03:28
Due to networking problems at our hosting centre, our server was inaccessible for around 40 minutes ... just back and I'm testing, but as it was a service-provider-wide problem I don't anticipate any local difficulties.   It MAY go up and down again a couple of times if they've just got a temporary fix or if they've cured a symptom not the root cause.

UPDATE - it was out for 55 minutes - 16:58 to 17:53.


Title: Re: Server outage problem this morning
Post by: grahame on February 17, 2014, 08:11:26
Overnight update ... a confirmation that the server ran perfectly all day yesterday ... it's just that it was unreachable for almost an hour because of network problems which affected all (?) sites hosted at the network centre where it's located. And, yes, I appreciate that such network problems made it useless for its intended purpose during that period.

We have backups, etc, and should a failure persist for longer in the future, we have the capability of being back on line elsewhere within 24 hours.  News is usually updated at http://melksh.am/status - and if there are wider problems check @transwilts, @wellho


Title: Re: Server outage problem this morning
Post by: bobm on February 17, 2014, 08:17:03
Overnight update ... a confirmation that the server ran perfectly all day yesterday ... it's just that it was unreachable for almost an hour because of network problems which affected all (?) sites hosted at the network centre where it's located. And, yes, I appreciate that such network problems made it useless for its intended purpose during that period.

Well it took down three of mine....but the sad thing is I noticed the Coffeeshop was down before my own!


Title: Re: Server outage problem this morning
Post by: thetrout on February 17, 2014, 21:30:08
Overnight update ... a confirmation that the server ran perfectly all day yesterday ... it's just that it was unreachable for almost an hour because of network problems which affected all (?) sites hosted at the network centre where it's located. And, yes, I appreciate that such network problems made it useless for its intended purpose during that period.

Well it took down three of mine....but the sad thing is I noticed the Coffeeshop was down before my own!

I noticed it as well. Also had intermittent problems with connectivity to my Dedicated Server yesterday but I think that was a slightly different issue. One set of IP Addresses worked and the others didn't... Most bizarre!


Title: Re: Server outage problem this morning
Post by: grahame on March 05, 2014, 05:10:08
Oh dear ... offline again from 03:17 to 04:45, due to the failure of the power supply in our dedicated server. Sorry to any night owls who noticed  ;)


Title: Re: Server outage problem this morning
Post by: grahame on March 28, 2014, 13:13:36
Glitch from 12:12 for about 45 minutes.   

Looking at what happened now; it appears that the server rebooted (perhaps a power supply glitch?) and some of the config set up when the disc was put into a new computer on 5th March hadn't been made permanent, so it was lost on reboot.

Please check notifications by email are working / let me know (via follow up here) if they do / don't work for someone.


Title: Re: Server outage problem this morning
Post by: grahame on March 29, 2014, 20:39:45
Some continuing issues and another shorter outage in the last hour.   I am stepping up the backup frequency while we look for a pattern then a fix; you may find occasional pauses in performance.


Title: Re: Server outage problem this morning
Post by: grahame on May 20, 2014, 03:57:35
Apologies (if anyone was awake  ;D )   for the outage from 01:42 to 03:42 ... the server hosting facility reported "The rack that this server was located in had suffered a partial loss of power".


Title: Re: Server outage problem this morning
Post by: grahame on October 31, 2014, 11:36:35
Out from 10:50 to 11:30 - sorry folks.   Reason under investigation.


Title: Re: Server outage problem this morning
Post by: grahame on January 19, 2015, 02:00:36
Out from around 9 p.m. until shortly after midnight.  Currently investigating what happened.



This page is printed from the "Coffee Shop" forum at http://gwr.passenger.chat which is provided by a customer of Great Western Railway. Views expressed are those of the individual posters concerned. Visit www.gwr.com for the official Great Western Railway website. Please contact the administrators of this site if you feel that content provided contravenes our posting rules ( see http://railcustomer.info/1761 ). The forum is hosted by Well House Consultants - http://www.wellho.net