
XML import

Sooner or later, every OpenCart store owner faces the same question: which tool can import a supplier's price lists and update the store's products quickly and correctly?

Various modules handle this task, each specializing in a single file format: CSV, YML and so on. There are also extensions that can process price lists of any format and arbitrary structure, but even they cannot cover every possible processing scenario. In addition, such advanced tools often carry redundant functionality and a rather complicated interface.

For a non-standard price list it is often much better to build a dedicated module: stable, fast, predictable and not overloaded with settings.

This import module was written to order for OpenCart 2.3. It is designed to import a large number of products and categories from an XML file with a non-standard structure (one that does not follow the formats used by price aggregators).

What we have:

  • XML file size: approx. 50 MB
  • Number of categories: over 25,000
  • Number of products: over 55,000
  • Number of images: over 170,000 with a total size of about 15 GB

What is needed:

  • Manual import launch, with the ability to fetch the file from a link
  • Display of processing progress
  • Download new images
  • Reasonable processing time
  • Ability to assign the same image to categories with the same name (the site structure is built on categories, and many subcategories share a name)
  • Ability to assign a special layout to categories below the root level (root categories and their children are displayed differently)
  • Update the price and quantity of products
  • Disable missing categories
  • Reset (quantity = 0) missing products
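
The last three requirements amount to reconciling the store against the feed. Below is a minimal sketch in Python, for illustration only: the module itself runs inside OpenCart, and the in-memory dictionaries and the names `reconcile_products` and `disable_missing_categories` are assumptions standing in for the store's database tables.

```python
def reconcile_products(store, feed):
    """Apply feed prices and quantities; zero the quantity of missing products.

    Both arguments are hypothetical dicts keyed by product code:
    {code: {"price": float, "quantity": int}}
    Returns the set of codes that are in the store but not in the feed.
    """
    for code, item in feed.items():
        # setdefault adds a new product or returns the existing one.
        product = store.setdefault(code, {})
        product["price"] = item["price"]
        product["quantity"] = item["quantity"]

    missing = set(store) - set(feed)
    for code in missing:
        store[code]["quantity"] = 0  # reset, not delete
    return missing


def disable_missing_categories(store_categories, feed_ids):
    """Set status to 0 for every stored category that is absent from the feed.

    store_categories is a hypothetical dict: {category_id: {"status": int}}
    """
    missing = set(store_categories) - set(feed_ids)
    for category_id in missing:
        store_categories[category_id]["status"] = 0
    return missing
```

Working with set differences keeps the "missing" step to one pass over the identifiers instead of one lookup per product or category.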

The main problem when processing large price lists is, of course, speed. Processing speed depends on many factors, from the web server configuration to the structure and number of MySQL queries issued during the import. In this case, the bottleneck was the huge number of images to download. If images are downloaded while each product is being processed, the import will take several hours at best and fail on server timeouts at worst.

A common mistake in such tasks is processing everything sequentially in a single run of the script. This approach can overload the web server and significantly increases the risk of errors (Maximum execution time, 504 Gateway Timeout, Allowed memory size and so on). The simplest way to avoid such errors is to use AJAX and process the data in batches rather than all at once. Moreover, because the requests are asynchronous, several batches are processed in parallel, which is considerably faster than processing everything in one pass.
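
The batching idea can be sketched as follows. This is an illustration in Python, not the module's code: the real module sends AJAX requests from the browser to PHP handlers, whereas here a thread pool stands in for the concurrent requests, and the names `split_into_batches` and `process_in_batches` are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def process_in_batches(items, handle_batch, batch_size, workers=4):
    """Run handle_batch on every batch, a few batches at a time.

    Each call stands in for one short request, so no single request
    has to fit the whole import into its time and memory limits.
    """
    batches = list(split_into_batches(items, batch_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the batches.
        return list(pool.map(handle_batch, batches))
```

For example, `process_in_batches(list(range(25)), len, batch_size=10)` returns `[10, 10, 5]`: three batches, each handled by its own short call.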

What is the result:

  • The user interacts with the server through AJAX.
  • It is possible to assign a layout to subcategories.
  • It is possible to assign an image to a category by the category name.
  • The progress of the current operation is displayed.
  • The import process goes in several steps:
    • Parsing the file and forming an array of all the data that needs to be processed
    • Importing categories. New ones are added, existing ones are updated. Each portion of data sent to the server is one root category with all its children, so 20 root categories mean 20 asynchronous AJAX requests to the server.
    • Processing category relations (doing this in one separate request is much faster than doing it for each category individually).
    • Importing products in batches of 10,000. New ones are added, existing ones are updated.
    • Downloading images to the server, in batches of 1,000 images in 16 threads. The multithreading is implemented with cURL Multi. The download speed is 1,200-1,500 images per minute. It can go even higher, but then stability suffers and there is a risk of getting “broken” zero-sized pictures. Of course, the download speed also depends on the web server's capabilities.
    • Disabling missing categories and zeroing missing products.
    • Completion of import and forming of results.
  • The initial import took a little over 1 hour.
  • Subsequent runs that update the site take no more than 10 minutes (3-4 minutes if there are few or no new images).
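
The image download step can be sketched roughly as below. The module uses cURL Multi in PHP; this Python illustration swaps in a thread pool to show the same pattern of fixed-size batches, a capped number of parallel downloads and a guard against zero-sized files. The function name `download_images` and the caller-supplied `fetch(url)` callable are assumptions, not part of the module.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def download_images(jobs, fetch, batch_size=1000, workers=16):
    """Download (url, path) pairs in batches; return the jobs that failed.

    fetch(url) -> bytes is supplied by the caller (hypothetical), which
    keeps the sketch independent of any particular HTTP client.
    """
    def download_one(job):
        url, path = job
        if os.path.exists(path) and os.path.getsize(path) > 0:
            return None  # already on disk: only new images are fetched
        try:
            data = fetch(url)
        except Exception:
            return job  # any fetch error marks the job as failed
        if not data:
            return job  # never write a zero-sized "broken" picture
        with open(path, "wb") as handle:
            handle.write(data)
        return None

    failed = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for start in range(0, len(jobs), batch_size):
            batch = jobs[start:start + batch_size]
            # Collect the jobs that came back instead of None.
            failed.extend(job for job in pool.map(download_one, batch) if job)
    return failed
```

Returning the failed jobs instead of raising lets a later run retry only those images, and skipping files that already exist is what keeps repeat imports short.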

This implementation of XML import for OpenCart does not, of course, claim to be the most correct or the fastest: there are many other ways to implement data import, and some may be faster, depending on the specific situation. Here the approach fully proved itself, and everyone was satisfied with the result.

So, if you need an OpenCart import module (similar or entirely different) for any OpenCart version, contact us and we will help you process a price list of any complexity.

Tags: import, xml