
Question about modules creating MySQL tables with different charsets.



Okay here goes (again)...

 

In case you can't tell, the theme of my recent posts is UTF-8 support and why my data keeps getting littered with garbage characters.  I've spent over three weeks trying to understand character sets, collations, and connection collation from cradle to grave.  I see that a clean install of PrestaShop uses:

 

InnoDB tables

UTF-8 charset

utf8_general_ci collation

 

After installing modules (some free, some paid; it happens with both), I see some of the following:

 

MyISAM tables

latin1 charset (at the table and column level)

latin1_swedish_ci collation

utf8_unicode_ci collation

 

It's then that I start seeing garbage and it just gets worse the more that I try to fix it:

 

<a href="http://capvapor.com/8-efest" title="Shenzhen Fest Technology Co, LTD was first established in 2009 with our main focus on chargers and Li-ion rechargeable batteries including LiMn ( IMR), Nickel Metal Hydride (Ni-MH) and LiFePO4 batteries currently in use for many different applications today including Flashlights, E-cigs &amp; Vaping Mods, Power Tools, Solar lighting, emergency lighting, cordless phone, etc...
Quality is of utmost importance in the continued development and growth 8of Efest Brand of Shenzhen Fest Technology .     Efest batteries are approved by CE and Comply with RoHS and our factories hold certification in compliance with ISO9001/2000. Over 90% of our products are exported overseas globally each year  .">eFest</a>

 

After all this time, I don't think there is anywhere else in PHP/MySQL/Apache where I can change a setting to "force" the UTF-8 charset, connection encoding, or collation.  I see that inside some modules, the PHP that creates the tables hard-codes ENGINE=MyISAM and CHARSET=latin1:

 

function installDB()
{
    // The AAAAAAA names are placeholders for the module's real table name.
    // Language table. Note the hard-coded engine and charset on the last line.
    $query = 'CREATE TABLE IF NOT EXISTS `'._DB_PREFIX_.'AAAAAAA_lang` (
        `id_AAAAAAA` int(11) DEFAULT NULL,
        `id_lang` int(11) DEFAULT NULL,
        `age` varchar(255) DEFAULT NULL,
        `url` text,
        `title` text,
        `reason` text,
        `mentions` text,
        '.((version_compare(Configuration::get('PS_INSTALL_VERSION'), '1.5.0') >= 0) ? "" : "`dateformat` text,`remember` text,").'
        `trust` text
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1';
    Db::getInstance()->Execute($query);

    // Main table, again created as MyISAM/latin1.
    $query = 'CREATE TABLE IF NOT EXISTS `'._DB_PREFIX_.'AAAAAAA` (
        `id_AAAAAAA` int(11) NOT NULL AUTO_INCREMENT,
        '.((version_compare(Configuration::get('PS_INSTALL_VERSION'), '1.5.0') >= 0) ? "`dateformat` text,`remember` text," : "").'
        PRIMARY KEY (`id_AAAAAAA`)
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=8';
    Db::getInstance()->Execute($query);
}

 

 

So I want to know whether these statements are the root cause of the garbage, and what issues (if any) it would cause if I modified them to create the tables with utf8 and even InnoDB.
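For illustration, here is a minimal sketch of what such a modified installer query could look like, with the engine taken from configuration and the charset set to utf8. It assumes the `_DB_PREFIX_` and `_MYSQL_ENGINE_` constants that a standard PrestaShop install normally defines in its settings file (the fallback definitions below only make the snippet runnable on its own). The table name `example_lang`, its columns, and the function name are hypothetical.

```php
<?php
// Fallbacks so the sketch runs outside PrestaShop; in a real shop both
// constants are expected to come from the shop's own configuration.
if (!defined('_DB_PREFIX_')) {
    define('_DB_PREFIX_', 'ps_');
}
if (!defined('_MYSQL_ENGINE_')) {
    define('_MYSQL_ENGINE_', 'InnoDB');
}

/**
 * Build a CREATE TABLE statement that uses the configured engine and utf8
 * instead of a hard-coded MyISAM/latin1 pair.
 * "example_lang" is a hypothetical table name used for illustration.
 */
function buildExampleLangTableSql()
{
    return 'CREATE TABLE IF NOT EXISTS `' . _DB_PREFIX_ . 'example_lang` (
        `id_example` int(11) DEFAULT NULL,
        `id_lang` int(11) DEFAULT NULL,
        `title` text,
        `reason` text
    ) ENGINE=' . _MYSQL_ENGINE_ . ' DEFAULT CHARSET=utf8';
}

// Inside a module this string would be passed to Db::getInstance()->Execute().
echo buildExampleLangTableSql(), "\n";
```

Only the last line of the statement differs from the module's version, so the change itself is small. Whether a given module relies on MyISAM-only behavior elsewhere is something to check module by module.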

 

I just don't understand why these module authors don't use the environment settings when creating these tables.  I see that they use other globals, and if this code is setting the connection collation to something other than utf8, it could be the root cause of my issues.

 

I am using 1.5.6.2 and I am pulling my hair out so if anyone could help me get this figured out I would greatly appreciate it!

 

If you need anything else from me please let me know.

 

Thank you,

James Stepp

Edited by jstepp28

Thank you for the quick response!  I can't tell you how frustrating it is to rebuild your site after finding every place (my.cnf, php.ini, httpd.conf, .htaccess, etc.) to tell the system to use the utf8 charset, collation, and connection, and then run into the same issues all over again because module authors don't follow proper practice when creating tables.

 

Should I change this before the tables are created, or let the module install and create its tables and then alter them in MySQL?  Do you see any issue with changing the engine to InnoDB?
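For the "alter them afterwards" route, a minimal sketch of the statements involved. It only builds the SQL strings, so nothing touches the database until they are run by hand or through the shop's Db class. The function name and the table name `ps_example_lang` are hypothetical.

```php
<?php
/**
 * Build the ALTER statements that convert one table to utf8 and InnoDB.
 * The table name is backtick-quoted; embedded backticks are doubled.
 */
function buildConvertStatements($table, $charset = 'utf8', $collation = 'utf8_general_ci', $engine = 'InnoDB')
{
    $quoted = '`' . str_replace('`', '``', $table) . '`';

    return array(
        // Changes the table default and converts every character column.
        "ALTER TABLE $quoted CONVERT TO CHARACTER SET $charset COLLATE $collation",
        // Switches the storage engine in a separate statement.
        "ALTER TABLE $quoted ENGINE=$engine",
    );
}

// Hypothetical table name, for illustration only.
foreach (buildConvertStatements('ps_example_lang') as $sql) {
    echo $sql, ";\n";
}
```

Take a backup before running anything like this. CONVERT TO CHARACTER SET re-encodes the stored values, so if a latin1 column already holds UTF-8 bytes, a straight conversion can bake the garbage in. The MySQL manual describes going through a binary column type for that case.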

 

I just want a "best practice" procedure for installing modules in order to keep my database clean from now on.

 

I can press on and hopefully get this shop live in the next few days now thanks to your help Vekia!

 

-James Stepp


I have one more question since we are on a roll here...

 

Is there a specific PDO connection string in classes => db => DbPDO.php that correctly sets up the connection as utf8?  My PHP version is 5.4.28.  I mean something like:

 

"mysql:host=$host;dbname=$db;charset=utf8"

 

Or, since the charset option was ignored in PHP versions < 5.3.6, the older approach (is this still supported in PHP 5.4.x?):

$dbh = new PDO("mysql:$connstr", $user, $password);
$dbh->exec("SET NAMES utf8");
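Both approaches can be combined in one helper. The following is a sketch only, not PrestaShop's actual connection code: the function names are hypothetical, and `connectUtf8()` needs the pdo_mysql extension and a reachable server, so only the DSN builder is exercised at the end.

```php
<?php
/**
 * Build a MySQL DSN that asks for a utf8 connection.
 */
function buildUtf8Dsn($host, $dbName)
{
    return "mysql:host=$host;dbname=$dbName;charset=utf8";
}

/**
 * Open a PDO connection and make sure the session uses utf8.
 * Requires the pdo_mysql extension and a reachable MySQL server.
 */
function connectUtf8($host, $dbName, $user, $password)
{
    $dbh = new PDO(buildUtf8Dsn($host, $dbName), $user, $password);
    $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    // Before PHP 5.3.6 the DSN charset was ignored, so set it explicitly.
    if (version_compare(PHP_VERSION, '5.3.6', '<')) {
        $dbh->exec('SET NAMES utf8');
    }

    return $dbh;
}

echo buildUtf8Dsn('localhost', 'testdb'), "\n";
```

Issuing `SET NAMES utf8` through `exec()` is an ordinary SQL statement, so it also works on PHP 5.4.x. There it is simply redundant when the DSN already carries `charset=utf8`.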

 

I just don't see it but I'm learning PHP so I may not be looking in the right place.  I would love a script that would give me the current encoding, connection charset, and collation in the debug info or maybe even on the page since it is so important:

Advanced Parameters > Configuration Information

I just want a solid way for the application to report all of the current MySQL encoding (utf8) settings, to ensure that it is set up right and stays that way...
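A report like that can be sketched with MySQL's `SHOW VARIABLES`. This is a minimal example under assumptions: the function names are hypothetical, `fetchCharsetVariables()` needs a live PDO connection, and the sample array at the end stands in for a real server's answer.

```php
<?php
/**
 * Fetch the session's charset and collation variables from MySQL.
 * Needs a live PDO connection; returns array(variable_name => value).
 */
function fetchCharsetVariables(PDO $dbh)
{
    $sql = "SHOW VARIABLES WHERE Variable_name LIKE 'character_set_%'
            OR Variable_name LIKE 'collation_%'";

    return $dbh->query($sql)->fetchAll(PDO::FETCH_KEY_PAIR);
}

/**
 * Return the variables that should start with "utf8" but do not.
 * character_set_filesystem and character_set_system are skipped on
 * purpose: MySQL normally reports "binary" and "utf8" for them.
 */
function findNonUtf8Variables(array $variables)
{
    $expected = array(
        'character_set_client', 'character_set_connection',
        'character_set_results', 'character_set_database',
        'character_set_server', 'collation_connection',
        'collation_database', 'collation_server',
    );

    $mismatches = array();
    foreach ($expected as $name) {
        if (isset($variables[$name]) && strpos($variables[$name], 'utf8') !== 0) {
            $mismatches[$name] = $variables[$name];
        }
    }

    return $mismatches;
}

// Invented sample values, for illustration only.
$sample = array(
    'character_set_client'     => 'utf8',
    'character_set_connection' => 'utf8',
    'character_set_results'    => 'utf8',
    'character_set_server'     => 'latin1',
    'collation_connection'     => 'utf8_general_ci',
    'collation_server'         => 'latin1_swedish_ci',
);
print_r(findNonUtf8Variables($sample));
```

With a real connection, `findNonUtf8Variables(fetchCharsetVariables($dbh))` returning an empty array would suggest the session-level settings are consistent.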

 

Thanks again,

-James Stepp


Is there a module that anyone knows of that can do what I'm asking?  I don't mind paying for something that can do what I need to in order to verify this is setup right and stays right.

 

Does anybody have any ideas on how to check the connection to MySQL, or how it is created using the PDO MySQL connection?  I think this is so important that there should be a way to verify that it is right and that it doesn't get changed.  I have a hard time believing that I am the only person having problems with this.
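Besides the connection, the tables and columns themselves can be audited through `information_schema`. The sketch below lists anything that is not InnoDB or not a utf8 collation, and could be run after each module install. The function names are hypothetical, and `runAudit()` needs a live PDO connection.

```php
<?php
/**
 * SQL listing base tables whose engine is not InnoDB or whose collation
 * does not start with "utf8". The schema name is bound as :schema.
 */
function buildTableAuditSql()
{
    return "SELECT TABLE_NAME, ENGINE, TABLE_COLLATION
            FROM information_schema.TABLES
            WHERE TABLE_SCHEMA = :schema
              AND TABLE_TYPE = 'BASE TABLE'
              AND (ENGINE <> 'InnoDB' OR TABLE_COLLATION NOT LIKE 'utf8%')";
}

/**
 * SQL listing character columns whose charset does not start with "utf8".
 * Non-text columns have a NULL charset and are skipped.
 */
function buildColumnAuditSql()
{
    return "SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
            FROM information_schema.COLUMNS
            WHERE TABLE_SCHEMA = :schema
              AND CHARACTER_SET_NAME IS NOT NULL
              AND CHARACTER_SET_NAME NOT LIKE 'utf8%'";
}

/**
 * Run one of the audit queries against a schema (database) name.
 */
function runAudit(PDO $dbh, $sql, $schema)
{
    $stmt = $dbh->prepare($sql);
    $stmt->execute(array(':schema' => $schema));

    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// Print the queries so they can also be pasted into a MySQL client.
echo buildTableAuditSql(), "\n\n", buildColumnAuditSql(), "\n";
```

An empty result from both queries would mean every table and text column in that schema is InnoDB and utf8.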

 

It doesn't look like the PDO connection DSN is created with the charset=utf8 parameter.  If it isn't, could it be added?  If it can't, is a SET NAMES utf8 being executed instead?  I have been researching the proper way to connect and set the required parameters for a utf8 connection using PDO:

 

Connecting to MySQL

old way:

<?php
$link = mysql_connect('localhost', 'user', 'pass');
mysql_select_db('testdb', $link);
mysql_set_charset('utf8', $link); // MySQL's charset name is "utf8", not "UTF-8"

new way: all you gotta do is create a new PDO object. PDO's constructor takes at most 4 parameters: DSN, username, password, and an array of driver options.

A DSN is basically a string of options that tells PDO which driver to use, plus the connection details... You can look up all the options on the PDO_MYSQL DSN page of the PHP manual.

<?php
$db = new PDO('mysql:host=localhost;dbname=testdb;charset=utf8', 'username', 'password');

Note: If you get an error about character sets, make sure you add the charset parameter to the DSN. Adding the charset to the DSN is very important for security reasons, yet most examples you'll see around leave it out. MAKE SURE TO INCLUDE THE CHARSET!

You can also pass several driver options as an array in the fourth parameter. I recommend passing the option that puts PDO into exception mode, which I will explain in the next section. The other option turns off prepare emulation, which is enabled in the MySQL driver by default but should be turned off to use PDO safely; emulation is really only useful with an old version of MySQL.

<?php
$db = new PDO('mysql:host=localhost;dbname=testdb;charset=utf8', 'username', 'password', array(PDO::ATTR_EMULATE_PREPARES => false,
PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION));

You can also set some attributes after PDO construction with the setAttribute method:

<?php
$db = new PDO('mysql:host=localhost;dbname=testdb;charset=utf8', 'username', 'password');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
 

If anyone knows the answer, I would appreciate some direction.  If I need to buy a module to check or make these changes, then so be it, but why wouldn't this be standard operating procedure?

 

Thanks to everyone...

James Stepp


My most pressing question about module installation:

 

Which of these is considered best practice when installing new modules?

 

1) Install new modules with the module's default charset/encoding, then convert the tables to utf8.

 

OR

 

2) Edit the code before the initial installation so the MySQL tables are created as utf8, and change the engine to InnoDB.

 

As I understand it, the rest of my questions and concerns would not be an issue if I can keep the database clean from a charset/encoding standpoint.

 

If someone can confirm which method is best or if there is a better way than either of these I would appreciate it.

 

I don't want to introduce any more issues while trying to fix this one, and this is the problem that most keeps me from feeling comfortable using PrestaShop.  My database tables are getting corrupted with extended or double-encoded characters, which is keeping me from going live.
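To tell whether a stored value is double-encoded, one common heuristic is to undo a single layer of encoding and see whether valid UTF-8 comes out. The sketch below assumes the mbstring extension, and the function name is hypothetical. It treats MySQL's latin1 as ISO-8859-1, which is a simplification, so characters such as curly quotes or the euro sign that went through the Windows-1252 range will be missed.

```php
<?php
/**
 * Heuristic: does this UTF-8 string look like UTF-8 that was read as
 * latin1 and then encoded to UTF-8 a second time?
 * Requires the mbstring extension.
 */
function looksDoubleEncoded($text)
{
    if (!mb_check_encoding($text, 'UTF-8')) {
        return false; // not valid UTF-8 at all, which is a different problem
    }

    // Undo one layer: map each character back to a single latin1 byte.
    $decoded = mb_convert_encoding($text, 'ISO-8859-1', 'UTF-8');

    // Pure ASCII survives unchanged and proves nothing.
    if ($decoded === $text) {
        return false;
    }

    // If the recovered bytes are themselves valid UTF-8 and contain a
    // non-ASCII byte, the original was probably encoded twice.
    return mb_check_encoding($decoded, 'UTF-8')
        && preg_match('/[\x80-\xFF]/', $decoded) === 1;
}

$clean  = "caf\xC3\xA9";          // "café" as correct UTF-8
$broken = "caf\xC3\x83\xC2\xA9";  // the same word encoded twice

var_dump(looksDoubleEncoded($clean));   // bool(false)
var_dump(looksDoubleEncoded($broken));  // bool(true)
```

This is a detection aid only. It can produce false positives on text that legitimately contains sequences like "Ã©", so any repair based on it should be reviewed on a copy of the data first.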

 

Thank you,

James Stepp


That is what I was thinking too.  I've been testing both methods, and there is a lot more involved in converting after the tables and columns have already been created with the latin1 charset and latin1_swedish_ci collation, but I wanted to bounce this off someone to make sure.

 

Thanks for the quick response and I will let you know if I see any issues or if this resolves the root problem...

 

-James Stepp
