PHP - Encoding issue when saving to an XML file using SimpleXML
I am struggling with encoding issues in a PHP app that:
- Reads an XML file and parses it according to some rules
- Calls the Google Translate API and uses the result to populate a database that is later used to display data in the browser (that part works well)
- Saves that data to an XML file (it saves but there's something wrong with the encoding).
The data comes back from Google Translate encoded in UTF-8, and it displays fine in the browser in any language, provided the page sends the proper charset header.
Here's the Google Translate function:
function mt($text, $lang) {
    global $apiKey; // assumed to be defined at global scope; its definition is not shown here
    $url = 'https://www.googleapis.com/language/translate/v2?key=' . $apiKey . '&q=' . rawurlencode($text) . '&source=en&target=' . $lang;
    $handle = curl_init($url);
    curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($handle, CURLOPT_SSL_VERIFYPEER, false); // disables certificate verification
    $response = curl_exec($handle);
    // The second argument is a boolean: true decodes JSON objects as associative arrays
    $responseDecoded = json_decode($response, true);
    $responseCode = curl_getinfo($handle, CURLINFO_HTTP_CODE);
    curl_close($handle);
    if ($responseCode != 200) {
        $resultxt = 'not200result';
    } else {
        $resultxt = $responseDecoded['data']['translations'][0]['translatedText'];
    }
    return $resultxt;
}
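To rule out the decoding step as the source of the problem, a check along these lines may help. It is only a sketch: the `$response` string is a hypothetical body shaped like the one `mt()` reads, not a captured API response, and it assumes the mbstring extension is available and that the script file itself is saved as UTF-8.

```php
<?php
// Hypothetical response body with the same shape mt() expects (not real API output).
$response = '{"data":{"translations":[{"translatedText":"\u3088\u3046\u3053\u305d"}]}}';

// json_decode() takes a boolean as its second argument; true returns associative arrays.
$decoded = json_decode($response, true);
$text = $decoded['data']['translations'][0]['translatedText'];

// json_decode() converts \uXXXX escapes into UTF-8 bytes, so both checks should pass.
var_dump(mb_check_encoding($text, 'UTF-8'));
var_dump($text === 'ようこそ');
```

If both checks pass for real responses too, the string is already valid UTF-8 when it leaves `mt()`, which would point to the XML serialization step rather than the API call.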
I'm using SimpleXML to load an XML file, modify its contents and save it with asXML(). The generated XML file seems to be encoded in something other than UTF-8, as it looks like this:
<value>ようこそ%0 ST数学</value>
Here's the code that assigns the translation to the XML node and saves it.
$xml = simplexml_load_file('myfile.xml'); // Load source XML file
$xml->addAttribute('encoding', 'UTF-8'); // Adds an encoding attribute to the root element (not to the XML declaration)
$xmlFile = 'translation.xml'; // File that will be saved
// Here I have a call to the mt() function above and assign its result to the XML node as-is.
$xml->asXML($xmlFile); // Save translated XML file
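One thing that may be worth trying, sketched below under stated assumptions: when the loaded document declares no encoding, libxml typically serializes non-ASCII characters as numeric character references, and setting the encoding on the underlying DOM document changes that. The in-memory XML string and the hard-coded translation are hypothetical stand-ins for myfile.xml and the `mt()` call, and the script file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical stand-in for myfile.xml; note the declaration has no encoding.
$xml = simplexml_load_string('<?xml version="1.0"?><root><value>Welcome</value></root>');

// Hard-coded translation used here in place of a call to mt().
$xml->value = 'ようこそ';

// SimpleXML and DOM share the same underlying document, so import it
// and declare UTF-8 as the output encoding.
$dom = dom_import_simplexml($xml)->ownerDocument;
$dom->encoding = 'UTF-8';

// Serialize to a string; $dom->save('translation.xml') would write a file instead.
$output = $dom->saveXML();
echo $output;
```

With the encoding set this way, the output should start with an XML declaration that names UTF-8 and contain the translated characters literally rather than as character references.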
I've tried using htmlentities() and played with utf8_encode() and utf8_decode(), but I can't make it work.
I've tried everything and looked at many other posts. For the life of me, I can't figure this one out. Any help is appreciated.