Image and data analysis

One of the techniques used by Elevate to deliver high quality results is image and data analysis. All images available in the data feed are automatically analysed.

Providing data in a truly structured manner is often not possible. In some cases, structure requires understanding of complex data relations which might not be supported in all systems. Furthermore, manual tagging is hard and much important information can be left out. This leaves important parts of a retailer's assortment unreachable for visitors, especially those with a clear purchase intent who use search or apply filters. The result is a poor experience and a lost sales opportunity.

Due to the importance of quality data, resources are often spent on creating product content both in terms of imagery and textual information. By automatically extracting important concepts from that data, Elevate is able to both complement and utilize this as if it were structured data.

1. The image analysis supports images in the following file formats: webp, png, jpg, and bmp. If you use image formats that we don't support, products with those images will be heavily ranked down and only one image will be returned from them. To solve an image file format issue, please contact your integrator.

2. Image analysis will only be performed on images with either a solid color or a gradient background.
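
As a pre-flight check on a data feed, the supported formats listed above can be tested against image URLs. The following is a minimal sketch, not part of Elevate: the function name is hypothetical, the check only looks at the file extension, and treating `jpeg` as an alias of `jpg` is an assumption.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Formats listed in this article. "jpeg" is an assumption (alias of "jpg").
SUPPORTED_EXTENSIONS = {"webp", "png", "jpg", "jpeg", "bmp"}


def has_supported_format(image_url: str) -> bool:
    """Return True if the URL path ends in a supported image extension."""
    path = urlparse(image_url).path  # drops query string and fragment
    suffix = PurePosixPath(path).suffix.lower().lstrip(".")
    return suffix in SUPPORTED_EXTENSIONS
```

An extension check is only a heuristic, since a file can carry a misleading extension, but it catches the most common feed mistakes before they affect ranking.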

Image analysis example

Color analysis

Over 400 different colors and nuances can be recognised by Elevate through image analysis. This is all done automatically using deep neural networks.

Image analysis is primarily used for fashion. For other industries, it can be better to use the colorOverride attribute on the product level to get the correct color representation.

Elevate can analyse a variety of image types including cutout images and images depicting fashion models. The analysis includes a built in understanding of color synonyms and how colors relate to each other. For example, bright red products won't be seen as "burgundy". A product in a darker shade of red will however automatically be considered as "burgundy" and "dark red" as well as "red" in both search and filter contexts.

The color analysis not only determines the color nuances of a product, but also the distribution. This means that an entirely white shirt will be recognized as more white than one striped blue and white.
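
To illustrate how nuances, synonyms, and distribution could fit together, here is a toy sketch. The nuance table, the share values, and the function name are all hypothetical; Elevate derives this information with neural networks, not with a lookup table like this one.

```python
# Hypothetical table: each detected nuance maps to the terms it should match.
COLOR_TERMS = {
    "bright red": {"bright red", "red"},
    "dark red": {"dark red", "burgundy", "red"},
    "white": {"white"},
    "blue": {"blue"},
}


def color_scores(nuance_shares: dict[str, float]) -> dict[str, float]:
    """Turn per-nuance area shares (0..1) into per-term scores."""
    scores: dict[str, float] = {}
    for nuance, share in nuance_shares.items():
        # Unknown nuances only match themselves.
        for term in COLOR_TERMS.get(nuance, {nuance}):
            scores[term] = scores.get(term, 0.0) + share
    return scores
```

With these assumptions, `color_scores({"white": 1.0})` gives "white" a higher score than `color_scores({"white": 0.5, "blue": 0.5})`, and only the dark red nuance produces a "burgundy" score.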

Providing a colorDefault attribute in the data feed enables color search, filters, and swatches sooner, as it takes effect before the image analysis is completed. Colors that are already known can therefore be provided up front, and they will be used until overridden by the automatic color analysis.

In cases with very complex imagery, such as boxes and packages that display other colors than the product, color analysis can be manually overridden using the colorOverride attribute.
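
The relationship between colorDefault, the automatic analysis, and colorOverride can be read as a precedence order. A minimal sketch, assuming each value is passed as an optional string (the function and its signature are illustrative only, not an Elevate API):

```python
from typing import Optional


def resolve_color(
    color_override: Optional[str],
    analysed_color: Optional[str],
    color_default: Optional[str],
) -> Optional[str]:
    """Pick the color to use: override, then analysis result, then default."""
    if color_override is not None:
        return color_override  # a manual override always wins
    if analysed_color is not None:
        return analysed_color  # analysis replaces the default once available
    return color_default  # used until the analysis has completed
```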

Image availability

Proper imagery is crucial when it comes to visually driven product industries such as fashion and lifestyle. This is generally handled well, but accidents with image handling do occur, and when they do, they need to be resolved fast. To avoid products without images filling up a site's most valuable retail space, Elevate automatically scans image availability at regular intervals.

A product can have multiple images and a priority order for how these images are to be displayed. Should the highest prioritized image be faulty, a fallback mechanism kicks in and the second most prioritized image is displayed. Should none of the images be accessible, the product will automatically be heavily buried in all contexts. This gives a retailer minimum site impact during the time needed to work out any imagery issues.
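
The fallback described above can be sketched as follows. The `Image` type, the convention that a lower number means higher priority, and the `accessible` flag (standing in for the result of an availability scan) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Image:
    url: str
    priority: int     # assumption: lower number = higher priority
    accessible: bool  # hypothetical result of the latest availability scan


def pick_display_image(images: list[Image]) -> Optional[Image]:
    """Return the highest-priority accessible image, or None if there is none."""
    candidates = sorted(
        (img for img in images if img.accessible),
        key=lambda img: img.priority,
    )
    # None signals that the product has no usable image and should be buried.
    return candidates[0] if candidates else None
```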

Image resolution analysis

For performance reasons, it is important to show images in the resolution appropriate for the context. Mobile devices typically use smaller images, thumbnail images are even smaller regardless of device, and for images in product listings it is important that the resolution is sufficiently high to showcase details. All images in the data feed are automatically analysed with respect to their width to provide a width value. This value also replaces an incorrectly provided width.

Providing width for an image in the data feed, even if the width is not fully accurate, is beneficial for faster enabling of performance gains as these will take effect before the image analysis is completed. This means non-exact, guiding, widths can be provided that with time will be corrected if exact widths are inaccessible.
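
A short sketch of how width values might be used on the consuming side. Both functions are hypothetical: the first mirrors the rule that an analysed width replaces a provided one, the second is one possible way for a frontend to pick an image for a given slot width.

```python
from typing import Optional


def effective_width(provided: Optional[int], analysed: Optional[int]) -> Optional[int]:
    """The analysed width replaces the provided one once it is available."""
    return analysed if analysed is not None else provided


def pick_image_for_slot(images: list[tuple[str, int]], slot_width: int) -> Optional[str]:
    """images holds (url, width) pairs; return the URL best suited to the slot."""
    if not images:
        return None
    wide_enough = [img for img in images if img[1] >= slot_width]
    if wide_enough:
        # Smallest image that still fills the slot, to save bandwidth.
        return min(wide_enough, key=lambda img: img[1])[0]
    # Nothing is wide enough, so fall back to the widest image available.
    return max(images, key=lambda img: img[1])[0]
```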

Image type analysis

Imagery depicting products normally comes as either cutout or model images, meaning it shows either only the product against a plain background or the product together with a model. Proper separation of image types enables faster color analysis, improves image consistency for hover or image gallery features on product cards, and allows for more flexible image prioritization by merchandisers.

All images are automatically analysed with respect to their image type. An image type will automatically be assigned to images without types, and any incorrectly given guiding image types will be corrected.

By providing a typeDefault or typeOverride attribute in the data feed, you enable performance gains as this will take effect before the image analysis is completed. The typeDefault attribute can be used to provide best-guess guiding image types, whereas the typeOverride attribute is recommended if well-structured data is available and the provided image types are known with a high degree of certainty.

The parameter typeOverride can be set to cutout, model, or misc. Images with the type set to misc will not be used for color analysis. An override to misc may be needed if the image has been incorrectly interpreted as one of the other two types.

Cutout images are used for color analysis when available, as they are preferred to ensure a more accurate color analysis. Model images are only used when no cutout images exist, and may lead to less accurate results.
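
The selection rule above can be sketched like this. The input shape, a list of (url, type) pairs, and the function name are assumptions for illustration.

```python
def images_for_color_analysis(images: list[tuple[str, str]]) -> list[str]:
    """images holds (url, image_type) pairs with type cutout, model, or misc."""
    cutouts = [url for url, image_type in images if image_type == "cutout"]
    if cutouts:
        return cutouts  # cutouts are preferred for accuracy
    # Model images are only a fallback; misc images are never used.
    return [url for url, image_type in images if image_type == "model"]
```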

Sometimes this classification is also incorrect, in which case you can set typeOverride to the correct type.

Size cleaning

Sizes often come in different scales, formats, and standards. Keeping sizes correct and easy-to-understand is especially important in industries such as fashion and lifestyle.

Size cleaning is currently only applied to clothing.

Getting size cleaning right manually is possible, but can take a very long time. Elevate comes with a size cleaning feature which automatically extracts and tries to interpret size information from un-curated size values in mixed formats from the data feed. However, this works even better when the size formats in the data feed are somewhat standardised.

Size cleaning in Elevate screenshot

Elevate's built-in size cleaning uses several methods to curate values, including:

Expanding SML size spans, such as `S-L` to values for `S`, `M`, and `L` respectively for the size facet while maintaining the correct presentation in the product card
Combining SML size values, such as `MEDIUM`, `medium`, `M`, `md` to a single value
Combining W/L (waist and length) values for trousers, such as `W32/L30`, `3230`, `32"/30"`
Normalising and combining values such as `One Size`, `one size`, and `O.S` to a single value
Handles numeric sizes with fit indicators. Recognizes and correctly sorts: "12 loose", "12 long", "12P", etc
Combining sizes for children's clothing, such as `110-116` and `110/116`
Combining descriptions of ages for children's clothing. Recognizes valid age-related suffixes (e.g., "yr", "yrs", "year") across multiple languages. Currently only available for English, Swedish, Norwegian, Danish, and Finnish
Double bra sizes are split, normalised, and combined as long as the band length (if any) appears before the cup size, such as `90C/D` to `90C` and `90D`
Splitting of multiple sizes provided in different formats, such as `M=44/46`, and size pairs separated with space, such as `44 46`
Common shoe sizes with fractions are combined and transformed into using a fraction character, such as `40 1/3` becomes `40⅓`
- Now supporting common fraction characters with denominators up to 8, to handle hat size sorting
Fractions vs. multi-dimension separation
- "3/4" is treated as a fraction
- "31/32" is treated as multiple dimensions and sorted accordingly
Ranges are sorted correctly (e.g., "5", "6-7", "8" follows a logical order)
Range indications such as "<2" and "8+" are sorted correctly
UK shoe size format ensures correct order: "UK 3 INF", "UK 4 INF", "UK 12 Junior" sort logically
Automatic separation of formats
Ensures logical sorting of sizes like "50 ml" before "100 ml", "Size 5" before "Size 10", "90 cm" before "100 cm"
Ensures variants within a product maintain a consistent order.
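
A few of the rules in the list above can be approximated with simple heuristics. The sketch below is a toy illustration, not Elevate's implementation: the size scale, the regular expressions, and the rule that a slash value counts as a fraction only when the denominator is at most 8 are assumptions chosen to reproduce the examples in the list.

```python
import re

SML_ORDER = ["XS", "S", "M", "L", "XL"]  # assumed scale for this sketch


def expand_sml_span(value: str) -> list[str]:
    """Expand a span such as "S-L" into ["S", "M", "L"]; leave others as is."""
    match = re.fullmatch(r"\s*([A-Za-z]+)\s*-\s*([A-Za-z]+)\s*", value)
    if match:
        start, end = match.group(1).upper(), match.group(2).upper()
        if start in SML_ORDER and end in SML_ORDER:
            i, j = SML_ORDER.index(start), SML_ORDER.index(end)
            if i <= j:
                return SML_ORDER[i : j + 1]
    return [value.strip()]


def classify_slash(value: str) -> str:
    """Tell a fraction ("3/4") apart from multiple dimensions ("31/32")."""
    match = re.fullmatch(r"(\d+)/(\d+)", value.strip())
    if not match:
        return "other"
    numerator, denominator = int(match.group(1)), int(match.group(2))
    return "fraction" if numerator < denominator <= 8 else "dimensions"


def numeric_sort_key(value: str) -> tuple[float, str]:
    """Sort by the first number in the value, so "50 ml" precedes "100 ml"."""
    match = re.search(r"\d+", value)
    return (int(match.group()) if match else float("inf"), value)
```

For example, `sorted(["100 ml", "50 ml"], key=numeric_sort_key)` places "50 ml" first, which a plain string sort would not.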

When combining values, the selected representative value is determined by the most common value in the data feed.
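
Choosing the representative by frequency can be sketched with a counter per group. The alias table and function names are hypothetical and only cover a few SML spellings.

```python
from collections import Counter, defaultdict

# Hypothetical alias table: lowercase spelling -> canonical group key.
ALIASES = {
    "small": "S", "s": "S",
    "medium": "M", "md": "M", "m": "M",
    "large": "L", "l": "L",
}


def canonical_key(raw: str) -> str:
    """Map a raw feed value to the group it belongs to."""
    cleaned = raw.strip().lower()
    return ALIASES.get(cleaned, cleaned)


def representative_values(raw_values: list[str]) -> dict[str, str]:
    """For each group, pick the raw spelling that occurs most often."""
    groups: dict[str, Counter] = defaultdict(Counter)
    for raw in raw_values:
        groups[canonical_key(raw)][raw.strip()] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in groups.items()}
```

Under these assumptions, a feed containing `Medium` twice and `M` and `md` once each would present the combined value as `Medium`.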

Combine size types

The "Combine size types" feature is intended to toggle whether the size facet should use size types or treat all sizes as a single group. When enabled, it combines all sizes into one long list, disregarding the size types. For example, a jacket with size 40 and a pair of pants with size 40 will be combined into a single size 40 option in the size facet.

However, this setting should be used with care. If your product catalog contains items from very different categories that happen to share the same size values, combining size types can lead to confusing results. For example, if both pants and shoes use "size 40", they would be merged into a single option in the size facet, even though they represent completely different things.
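
The effect of the setting can be illustrated with a small sketch. The data shape, a list of (size type, size) pairs, and the group name "all" are assumptions; this is not how the facet is represented in Elevate.

```python
from collections import defaultdict


def size_facet(variants: list[tuple[str, str]], combine_size_types: bool) -> dict[str, list[str]]:
    """Group sizes per size type, or into one shared group when combined."""
    facet: dict[str, set[str]] = defaultdict(set)
    for size_type, size in variants:
        group = "all" if combine_size_types else size_type
        facet[group].add(size)
    return {group: sorted(sizes) for group, sizes in facet.items()}
```

With the setting enabled, a jacket in size 40 and a pair of pants in size 40 end up as one "40" entry; with it disabled, each size type keeps its own list.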

Product type extraction

A lot of effort is often put into product categorization and grouping. However, often the categorization lacks the necessary structure for it to be used in an optimal way.

One aspect of categorization that is especially important is the concept of product types. Product types are often lumped together with other types of concepts such as "Sale", "Knitted", or a combination of multiple product types such as "Hats & Scarves". The data often lacks the proper granularity when distinguishing between product types and collections in general. Furthermore, a significant number of products are often miscategorised, despite much manual effort.

Through sophisticated data analysis, Elevate identifies and extracts the product type.

To acquire a high precision product type identification, several methods are used. One is applying built-in relational information about product types and concepts. An example of this is applying the knowledge of "T-shirt" being a common fit of a bra. Rather than categorizing a "T-shirt bra" as both "T-shirt" and a "bra" it will correctly only be categorized as a "bra" with a "T-shirt" fit.

Another method is utilizing built-in language and terminology knowledge. For example, a "Tee" can be identified as a "T-shirt".

Another example is the knowledge that English terms might very well appear in Swedish content, allowing weighted cross-language product type identification from single attributes. The word "bra", which means "good" in Swedish, could be interpreted either as something insignificant or as the actual product type "bra". Based on context and linguistic knowledge, Elevate can assess that a "bra T-shirt" in Swedish is likely a "T-shirt" while a "plunge bra" is likely a "bra".
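
The examples above can be mimicked with a deliberately simple heuristic: map synonyms to a canonical term, then treat the last recognised product type term as the product type and any earlier ones as modifiers. This is a toy sketch under that assumption, with hypothetical word lists; Elevate's actual analysis relies on context and linguistic knowledge rather than word position.

```python
from typing import Optional

SYNONYMS = {"tee": "t-shirt"}     # hypothetical synonym table
PRODUCT_TYPES = {"t-shirt", "bra"}  # hypothetical product type vocabulary


def extract_product_type(title: str) -> tuple[Optional[str], list[str]]:
    """Return (product_type, modifiers) using a last-term-wins heuristic."""
    tokens = [SYNONYMS.get(token, token) for token in title.lower().split()]
    found = [token for token in tokens if token in PRODUCT_TYPES]
    if not found:
        return None, []
    # The last match is taken as the type; earlier matches act as modifiers.
    return found[-1], found[:-1]
```

With these word lists, "T-shirt bra" yields the type "bra" with "t-shirt" as a modifier, "bra T-shirt" yields "t-shirt", and "Basic Tee" yields "t-shirt".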

Similar Items

Finding similar items is a fundamental task for an information retrieval system like an e-commerce platform. Based on current technology, two types of item similarity can be considered: functional and visual.

Functional similarity is achieved using concept extraction. In effect, it is assumed that functional similarity is implied by the product type.

Visual similarity uses product images as input. Images are transformed into a metric space in which visually similar items are close, and dissimilar items are far away from each other.

To find similar items in the alternatives recommendation function, both techniques are merged to make sure that items are similar in both respects.
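
Merging the two techniques can be sketched as a filter followed by a ranking: keep only items with the same product type, then order them by closeness of their image vectors. The item structure, the field names, and the use of cosine similarity are assumptions for illustration; the article does not specify which distance measure Elevate uses.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def alternatives(query: dict, catalog: list[dict], top_k: int = 3) -> list[str]:
    """Return ids of items that share the product type, most similar first."""
    candidates = [
        item
        for item in catalog
        if item["id"] != query["id"]
        and item["product_type"] == query["product_type"]  # functional similarity
    ]
    candidates.sort(
        key=lambda item: cosine_similarity(query["embedding"], item["embedding"]),
        reverse=True,  # visual similarity, closest first
    )
    return [item["id"] for item in candidates[:top_k]]
```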

Article last updated 31 March 2026 11:19
