grahame
« on: November 22, 2015, 17:31:03 »
I'm looking at a potential fix for the spewing character problem (technically it's an issue with character sets) that's been bugging us for a while, where special characters in older posts become damaged when we back up / restore. The problem comes from old forum software, a database very much more recent than it was ever designed / tested for, and a technical admin who looks after this site as a hobby and let it happen and get a bit out of hand (you can also blame his lack of technical expertise in one of the technologies). However, before applying a fix I would like to check with members. Here are the options on offer:
Option 1: Status Quo - make no change, which means that on every backup / restore cycle (something that happens every few months) the number of special characters in each rubbish sequence goes up by between 33% and 50%.
Option 2: Replace special character sequences in past posts (and in future posts) with a single placeholder character such as ^. This would show up as follows: (See original at) http://www.firstgreatwestern.info/coffeeshop/index.php?topic=12532.0
Option 3: Neither of the above options is acceptable - so have someone else take a look at the problem / see what they can come up with, if anything. I can't know / predict any outcome from this option, but the issue is a significant one. Members of other forums may have noted that they have gone as far as closing old forums and opening new ones to overcome issues; considerable time and effort and loss of historic data comes from that, with no guarantee (from me) that I would have the time / resource to be anything like as involved in the extra peak of work as I have been in the past. Potentially there are cost / hosting issues too - but then if the vote goes for this option, I would suggest a further discussion (or perhaps we can have it here?) as to what the terms of reference of the "take a look" commission would be, and how we might appoint and potentially pay for such a look / see if it's not something that's in the volunteer sector.
For once, I will be voting in my own poll - and that will be for option 2. It seems to provide a solution to an ongoing problem that has the potential to get worse, to the extent that it could damage the forum's content and - in time - its very future. Indeed I think we may have already lost (but I can potentially roll back) one block of posts from about 7 or 8 years back. Option 1 is driving on towards a wall into which we could crash rather nastily in the medium term. Option 3 could potentially be a brave new perfect start, but you (members) would lose a lot of traction along the way and would need additional (or new) technical support depending on how it went.
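For anyone wondering how rubbish sequences can grow on each backup / restore cycle: one common cause is UTF-8 text being read back as Latin-1 and then saved as UTF-8 again. The minimal Python sketch below simulates that, assuming this is the mechanism at work; the thread does not confirm it, and in this simple model each special character doubles per cycle rather than growing by the 33% to 50% reported above. The name `misread_cycle` and the sample string are illustrative only.

```python
def misread_cycle(text: str) -> str:
    """Simulate one faulty backup/restore: UTF-8 bytes read back as Latin-1.

    Assumption: this is the corruption mechanism; the thread does not say so.
    """
    return text.encode("utf-8").decode("latin-1")


original = "£5 café"          # two special characters, five plain ones
once = misread_cycle(original)
twice = misread_cycle(once)

# Show how the text lengthens with every simulated cycle.
for label, value in (("original", original),
                     ("after 1 cycle", once),
                     ("after 2 cycles", twice)):
    print(f"{label:>15}: {len(value)} characters: {value!r}")
```

Plain ASCII characters pass through untouched; only the characters outside ASCII multiply, which matches the observation that the damage clusters around special characters.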
Coffee Shop Admin, Chair of Melksham Rail User Group, TravelWatch SouthWest Board Member
bobm
« Reply #1 on: November 22, 2015, 18:10:03 »
I too back option 2 - it seems a happy compromise between what we have now and a very expensive and time-consuming operation which may not provide a solution. (Anyone who starts comparing Status Quo with other rock bands will be given a thread to manually replace each extraneous character one by one!)
Bmblbzzz
« Reply #2 on: November 22, 2015, 18:23:47 »
It's not just old posts, I've seen it on some quite recent ones. But I don't really understand why it happens. This is on SMF, isn't it? I know that other forums running on (ancient versions of) SMF don't have the same problem.
Anyway, Option 2 seems better than Option 1.
Waiting at Pilning for the midnight sleeper to Prague.
grahame
« Reply #3 on: November 22, 2015, 18:32:27 »
Quote from Bmblbzzz: "It's not just old posts, I've seen it on some quite recent ones. But I don't really understand why it happens. This is on SMF, isn't it? I know that other forums running on (ancient versions of) SMF don't have the same problem. Anyway, Option 2 seems better than Option 1."
Thank you. It's not really an SMF issue ... it's basically my handling of backup and restore, on the occasions when the database has been flakey, mixing up character sets. If the vote suggests I go ahead, then I'll also take a look and see what I can do to avoid the problem breeding again now that I know what it is ... I HAVE considered reverse engineering to unwrap it, but it's so wrapped up that it's beyond me, frankly.
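For the cleanest possible case, unwrapping can be sketched in a few lines of Python. This assumes every layer of damage was exactly "UTF-8 bytes read as Latin-1" with no bytes lost or altered on the way, which, as the post above says, is probably not true of this database. The function name `unwrap` and its `max_rounds` parameter are hypothetical.

```python
def unwrap(text: str, max_rounds: int = 10) -> str:
    """Peel off repeated 'UTF-8 read as Latin-1' damage, one layer per round.

    Assumption: each layer is a clean misread with nothing lost in between.
    """
    for _ in range(max_rounds):
        try:
            candidate = text.encode("latin-1").decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            break  # no further clean layer to peel off
        if candidate == text:
            break  # pure ASCII: nothing left to change
        text = candidate
    return text


# Build a doubly damaged sample, then try to recover it.
damaged = "café".encode("utf-8").decode("latin-1").encode("utf-8").decode("latin-1")
print(repr(damaged), "->", repr(unwrap(damaged)))
```

The approach stops working as soon as a single byte in a sequence has been dropped or substituted, and it can also mangle text that legitimately contains sequences such as "Ã©", so it is no substitute for testing against real data.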
trainer
« Reply #4 on: November 22, 2015, 19:22:06 »
As an ordinary 'punter' on the Forum, with no understanding of the technicalities, but annoyed with the disrupted messages from time to time, I simply wish you to take the most straightforward action to minimise the issue. Much voluntary time is spent allowing people like me to dip in and out with no responsibility other than to make comments, for which I (and I am sure most others) am very grateful. I accept that this is a challenging matter to change and am reluctant to ask for any course of action which would mean someone else spending hours of time making my occasional time on here marginally better. I will be grateful for any improvement. Thanks to all you technical people for what we do have. I also have a sneaking satisfaction when those who 'know all about IT' are beaten by it.
Chris from Nailsea
« Reply #5 on: November 22, 2015, 23:19:58 »
William Huskisson MP was the first person to be killed by a train while crossing the tracks, in 1830. Many more have died in the same way since then. Don't take a chance: stop, look, listen.
"Level crossings are safe, unless they are used in an unsafe manner." Discuss.
GBM
« Reply #6 on: November 23, 2015, 10:53:20 »
Totally agree with trainer. Option 2 for me please.
Great pleasure for me to read and occasionally post.
Thank you to all the "admin" who run the forum.
Personal opinion only. Writings not representative of any union, collective, management or employer. (Think that absolves me...........)
Rhydgaled
« Reply #7 on: November 23, 2015, 11:51:31 »
Option 2 for me as well, I think. Just a couple of questions though:
- Would this replace only the garbage sequences, or would the code (I assume this would be an automated replacement) not be able to distinguish them from potentially useful characters (in other words, which characters would be replaced)?
- Would/could the replacement character also start to breed?
---------------------------- Don't DOO it, keep the guard (but it probably wouldn't be a bad idea if the driver unlocked the doors on arrival at calling points).
grahame
« Reply #8 on: November 23, 2015, 12:04:58 »
Quote from Rhydgaled: "Would this replace only the garbage sequences, or would the code (I assume this would be an automated replacement) not be able to distinguish them from potentially useful characters (in other words, which characters would be replaced)?"
Testing has shown that we are unlikely to lose anything useful ... I have tried out much more than the sample before I suggested it.
Quote from Rhydgaled: "Would/could the replacement character also start to breed?"
No
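A short Python sketch of why a plain ^ should stay put: it is an ASCII character, and ASCII characters are stored as the same single byte in both Latin-1 and UTF-8, so a misread between the two cannot change it. The sketch assumes the damage comes from that kind of misread, which the thread does not confirm; `misread_cycle` is an illustrative name.

```python
def misread_cycle(text: str) -> str:
    """Simulate one faulty backup/restore: UTF-8 bytes read back as Latin-1.

    Assumption: this is the corruption mechanism; the thread does not say so.
    """
    return text.encode("utf-8").decode("latin-1")


marker = "^"
result = marker
for _ in range(5):          # five simulated backup/restore cycles
    result = misread_cycle(result)

# The marker is one ASCII byte (0x5E) in both encodings, so it never changes.
print(repr(marker), "->", repr(result), "| unchanged:", result == marker)
```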
Red Squirrel
« Reply #9 on: November 23, 2015, 13:35:32 »
Just a thought: Don't get rid of all the 'special characters' from this forum - some of us have nowhere else to go...
Things take longer to happen than you think they will, and then they happen faster than you thought they could.
Rhydgaled
« Reply #10 on: November 23, 2015, 14:05:49 »
Quote from grahame:
"Would this replace only the garbage sequences, or would the code (I assume this would be an automated replacement) not be able to distinguish them from potentially useful characters (in other words, which characters would be replaced)?" Testing has shown that we are unlikely to lose anything useful ... I have tried out much more than the sample before I suggested it.
"Would/could the replacement character also start to breed?" No
Sounds good.
grahame
« Reply #11 on: December 22, 2015, 07:04:18 »
I am aware that I've not actioned this yet ... I'm seeing it as important but not time-critical and have had my plate rather full. It also needs doing at a time that I've got excellent and continuous net access, when the forum is quiet, and when I'm feeling bright, well and awake enough to do it without significant interruptions. I'm writing this update from (!) yet another hotel room, with a stinking cold and a course to give ... not the morning to give it a go. Besides - it's a commuter morning and if I take the site down, chances are that signalling will pop in the Thames Valley, or the windscreen wipers will fail on 153369 again!
P.S. Well done to the Great Western operational team who got another 153 down to Westbury in time to run the busiest TransWilts service of the day yesterday evening. Units will occasionally break down, and it's really appreciated when actions are taken to get a replacement in sooner rather than later.
grahame
« Reply #12 on: December 25, 2015, 04:06:51 »
OK - let's see how that worked ... process went well / almost too well. I am a bit concerned at just how much the database table dropped in size - it may be that there was an awful lot of corruption in old posts, or I may have done some damage. If something turns up that's unfortunate, I do have backups. PLEASE let me know of any issues.
Chris from Nailsea
« Reply #13 on: December 25, 2015, 20:43:47 »
grahame
« Reply #14 on: December 26, 2015, 08:07:51 »
OK - looks good after a further 24 hours, and overnight I came up with a further test to check that we hadn't lost, at least, whole messages:
mysql> select count(id_msg) from smf_old_messages;
+---------------+
| count(id_msg) |
+---------------+
|        185432 |
+---------------+
1 row in set (0.39 sec)
mysql> select count(id_msg) from smf_messages;
+---------------+
| count(id_msg) |
+---------------+
|        185459 |
+---------------+
1 row in set (0.11 sec)
Also tells me we've had 27 posts since the early hours of Christmas morning. We can probably consider the matter concluded ... we may see a few re-appear in new posts that use special characters, but having fixed the issue once, it would be very quick and easy to do it again if need be.
For (my) record later - code used:
    open(my $fh, '<', 'preclean.sql') or die "Cannot open preclean.sql: $!";
    while (my $line = <$fh>) {
        # Collapse each run of 2+ high-bit bytes (runs joined by one ASCII byte count as one) to "^"
        $line =~ s/[\x80-\xff]{2,}(?:[\x00-\x7f][\x80-\xff]{2,})*/^/g;
        print $line;
    }
    close($fh);
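To see which byte sequences that pattern does and does not catch, here is a small Python sketch that applies the same regular expression to a few made-up byte strings. It is a Python rendering for illustration, not the code that was run on the forum; the function name `clean` and all the sample strings are invented.

```python
import re

# Same pattern as the Perl substitution, applied to raw bytes:
# a run of 2+ bytes in 0x80-0xff, optionally continued by a single
# ASCII byte followed by another such run.
PATTERN = re.compile(rb"[\x80-\xff]{2,}(?:[\x00-\x7f][\x80-\xff]{2,})*")


def clean(line: bytes) -> bytes:
    """Replace every matching sequence with a single caret."""
    return PATTERN.sub(b"^", line)


samples = [
    b"plain ASCII text",                       # no high bytes: untouched
    b"caf\xc3\x83\xc2\xa9 stop",               # one damaged character
    b"\xc3\x82\xc2\xa35 \xc3\x82\xc2\xa36",    # two separate damaged runs
    b"\xc3\xa2\xc2\x80X\xc2\x99\xc3\x82 end",  # two runs joined by one ASCII byte
    b"lone \xe9 byte",                         # single high byte: untouched
    "café".encode("utf-8"),                    # intact UTF-8 character
]

for raw in samples:
    print(raw, "->", clean(raw))
```

Two things stand out. A single high byte on its own is left alone, so an undamaged Latin-1 character survives. An intact two-byte UTF-8 character, however, also counts as a run of two high bytes and becomes ^, and a lone ASCII character sandwiched between two runs is swallowed with them.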