The Hidden Costs of Unnecessary Attributes
In the increasingly competitive landscape of e-commerce and retail, product attribute management plays a pivotal role in the success of product taxonomies. One of the key dilemmas that data model owners and category schema developers face is knowing when to add more to their attribution model. However, these users often neglect the hidden costs of adding new attributes: the cost of remediating the existing data, and the cost of collecting that data on the next product.
These costs are not a per-attribute fee. They generally fall into the following categories:
- Delays in products reaching a market
- Products dropping from listings because required data is missing
- The added cost of resources to find the missing data
- Vendor concerns about changing attribution
- Returns caused by incorrect data, entered because the person filling in the field stuffed in any value just to bypass the requirement
These costs may be minimal for small businesses or limited hierarchies/assortments. However, as an organization grows in complexity, the number of changes multiplies rapidly as new assortments are sourced, new channels and regions are entered, and data complexity grows. The days of capturing every attribute you possibly can are over; that approach doesn’t fit the reality of the industry-wide data collection mess businesses face today. Our intent in this post is to give you three simple rules for when to add an attribute and, more importantly, when NOT to add one.
Rule #1 – It Must Be Collectible
This might seem self-evident, but you would be surprised how often the attribute you need just isn’t available. I managed an electronics taxonomy and made “Contrast Ratio” a required attribute on all LCD TVs, as contrast ratio was a valid attribute on any LCD screen at the time. One of our larger suppliers refused to fill in the data on TVs below a specific size. Their reasoning was a crucial learning point for me: They didn’t publish that data because, below a certain screen size, the contrast ratio made no meaningful difference in the quality of the TV output. Their internal policy was to never publish this data point, as they felt their competitors would inflate their numbers and make the screens on the TVs they sold look uncompetitive. They said that if we didn’t remove the attribute, they would respond with “0” every time on all TVs. Fighting over the validity of an attribute that the supplier would never provide anyway wasn’t worth the issues it was creating, so I changed my decision to match what was collectible, not just what was applicable.
Attributes that are uncollectible but mandated as required slow down data collection, hurting speed to market. An attribute should only be added when it is readily apparent that it applies to the majority of the products in that category and there is a likely data source someone can use to complete it.
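One way to test collectibility before adding an attribute is to pull a sample of products from the category and check two things for each: does the attribute apply, and is there a source for the value? The sketch below is only an illustration. The `SampleProduct` fields, the function names, and the reading of “majority” as more than half are all assumptions, not a standard.

```python
from dataclasses import dataclass


@dataclass
class SampleProduct:
    sku: str
    attribute_applies: bool  # does the attribute make sense for this product?
    source_available: bool   # is there a spec sheet or supplier feed for the value?


def collectible_share(sample):
    """Fraction of sampled products where the attribute applies AND has a data source."""
    if not sample:
        return 0.0
    hits = sum(1 for p in sample if p.attribute_applies and p.source_available)
    return hits / len(sample)


def is_collectible(sample, majority=0.5):
    # "Majority" is read as strictly more than half; the threshold is an assumption.
    return collectible_share(sample) > majority


# Hypothetical sample modeled on the contrast ratio story: the attribute applies
# to every LCD TV, but the supplier publishes it only for the larger screens.
tv_sample = [
    SampleProduct("TV-24", attribute_applies=True, source_available=False),
    SampleProduct("TV-28", attribute_applies=True, source_available=False),
    SampleProduct("TV-32", attribute_applies=True, source_available=False),
    SampleProduct("TV-55", attribute_applies=True, source_available=True),
    SampleProduct("TV-65", attribute_applies=True, source_available=True),
]

share = collectible_share(tv_sample)
verdict = is_collectible(tv_sample)
```

In this made-up sample the attribute is applicable everywhere but collectible for only two of five products, which is the gap between “applicable” and “collectible” described above.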
Rule #2 – It Must Be Enforceable
I’m going out on a limb on this one because I know there are multiple schools of thought. However, in my opinion, if an attribute cannot be mandatory, you should not add it to your schema.
*Waits for the yelling to stop*
Let’s be real for a minute… If you are in data collection and you are presented with 100 attributes, 50 of which are mandatory, you are filling in 50 attributes. Optional attributes are easy to skip, so they get skipped. Yes, someone might fill in a couple of optional attributes, but nobody is looking through tech sheets and documentation for an attribute they don’t HAVE to fill out. All that you have accomplished is putting more attributes on a screen to cloud the 50 or so attributes that will actually get filled out. Those attributes aren’t just zero-value… They clutter the UI and force people to take longer to collect the data your requirements say you need.
I agree that optional-but-conditionally-required attributes are a solution to this problem. These are different from optional attributes because they drive a requirement that may apply to only part of the assortment in a category, without having to split the category to capture that data as a required attribute. Building categories just for the sake of building categories is no better than a ton of optional attributes. If you don’t have conditionally-required attribute functionality, it may be time to split categories, but that is for a different post.
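A conditionally-required attribute can be pictured as a rule: “when the record looks like this, these extra attributes become mandatory.” The sketch below is a minimal illustration, not how any particular PIM system implements it. The attribute names, the rule format, and the 40-inch threshold are all hypothetical.

```python
# Attributes every record in the (hypothetical) LCD TV category must have.
ALWAYS_REQUIRED = ["brand", "screen_size_in"]

# Each rule pairs a condition on the record with the attributes it makes mandatory.
# The 40-inch cutoff is an invented example, not a real supplier policy.
CONDITIONAL_RULES = [
    {
        "when": lambda record: (record.get("screen_size_in") or 0) >= 40,
        "require": ["contrast_ratio"],
    },
]


def required_attributes(record, always_required, conditional_rules):
    """Return the set of attributes that are mandatory for this specific record."""
    required = set(always_required)
    for rule in conditional_rules:
        if rule["when"](record):
            required.update(rule["require"])
    return required


def missing_attributes(record, always_required, conditional_rules):
    """Return the mandatory attributes that are absent or blank, sorted by name."""
    needed = required_attributes(record, always_required, conditional_rules)
    return sorted(a for a in needed if record.get(a) in (None, ""))


small_tv = {"brand": "Acme", "screen_size_in": 24}
large_tv = {"brand": "Acme", "screen_size_in": 55}

small_missing = missing_attributes(small_tv, ALWAYS_REQUIRED, CONDITIONAL_RULES)
large_missing = missing_attributes(large_tv, ALWAYS_REQUIRED, CONDITIONAL_RULES)
```

With these made-up rules, the small TV passes without a contrast ratio while the large TV is flagged as missing one, and both still live in the same category.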
So ditch the optional attributes and monitor the data collection process. It will be faster, the UI cleaner, and the submitters of the data happier.
Rule #3 – It Must Be Valuable
There was a time when defining a category down to the smallest detail seemed like the only approach to data collection. If you collected every conceivable data point, you could convince shoppers to buy from your site because it held a complete record of the item. We were capturing button-hole sizes on shirts and the color of junction boxes that went inside walls, never to be seen again after being covered by a wall plate. We were reckless… But powerful.
Reality kicks in eventually when you realize that a growing portion of the attribution provided is relegated to below the below-the-fold content. No, that is not a typo. The A+ content section on Amazon appears before the Product Details section on their page template, which is where the majority of category-specific attributes live. Heatmaps using eye-tracking technology show this section of the page is not well visited, and only half the attributes are actually looked at by most users. Adding attribute N+1 to that section, with the collection and remediation costs involved, had better be worth the effort.
There are generally only three ways an attribute is valuable in product data:
- It helps administer a product through the data collection process
- It helps a user in the data collection process find and/or complete a record
- It helps a consumer find or purchase that product
It’s that simple, and it applies to ANY attribute in the product data domain. If it doesn’t help you complete the data collection process, administer the data collection process, or the customer find and buy the product, that attribute isn’t valuable. If it isn’t valuable, it has no return on investment.
Extra Attributes Add Friction to the Data Collection Process
Every time you put an irrelevant or uncollectible attribute in front of a user, you create friction in the process. That friction slows down data collection, and it also incentivizes finding ways to fill out the field without providing the expected data. This is data collection fatigue, and it eats at data quality fairly quickly. The more attributes a user sees in front of them, the faster that fatigue sets in. Users who are entering hundreds of records a week will find every trick in the book to avoid filling out attributes, including the ones they should fill out.
Creating data fatigue in your users through excess attribution is not a made-up way to scare you into using fewer attributes. I have been involved with enough data collection process redesigns to see data fatigue in real time. Excess attribution, along with bad UI design that places all the attributes on a single tab, will have users frustrated, exhausted, and looking for shortcuts in very little time.
If you want to see if you have a data collection fatigue issue with your users, check your approvals process. If you are rejecting over 10% of the records coming in on their first pass, you probably have a data collection fatigue problem. At 20%, this becomes definite. Don’t have an approval process? All the bad data created by this data collection fatigue problem is going straight to your website, your channel partners, and potentially your social commerce and print applications.
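The check described above is simple arithmetic on your approval log. The sketch below assumes you can export each record’s first-pass outcome as the string "approved" or "rejected". That export format and the function names are hypothetical, while the 10% and 20% thresholds are the rules of thumb from this post.

```python
def first_pass_rejection_rate(first_pass_outcomes):
    """Share of records rejected on their first pass through approvals."""
    if not first_pass_outcomes:
        raise ValueError("no first-pass outcomes to evaluate")
    rejected = sum(1 for outcome in first_pass_outcomes if outcome == "rejected")
    return rejected / len(first_pass_outcomes)


def fatigue_signal(rate):
    """Map a rejection rate onto the post's rule of thumb."""
    if rate >= 0.20:
        return "definite"   # at 20% the problem is definite
    if rate > 0.10:
        return "probable"   # over 10% the problem is probable
    return "none indicated"


# Hypothetical week of submissions: 12 of 100 records rejected on first pass.
outcomes = ["rejected"] * 12 + ["approved"] * 88
rate = first_pass_rejection_rate(outcomes)
signal = fatigue_signal(rate)
```

Counting only the first pass matters here: a record that is rejected, fixed, and later approved still counts as a first-pass rejection.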
These extra attributes that don’t add specific value aren’t just empty holes in data. They slow down the data collection process and provide opportunities for users to submit bad data. Imagine you are completing data collection for 100 SKUs, and the fifth attribute you reach is something you can’t fill out. So you make up data to fill that field because you have 99 other SKUs to go. Then the 20th attribute is irrelevant, either because it doesn’t apply to the category or because you just can’t find the data to supply. So you start putting in N/A or not applicable. Sometimes you put a period because the field will accept it.
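If you suspect this is happening, one rough check is to scan an attribute’s collected values for filler. The sketch below is an assumption-laden illustration: the placeholder list is only a starting point built from the examples in this post, and "0" in particular can be a legitimate value for some attributes, so tune the list per attribute.

```python
# Filler values drawn from the examples in this post. This list is an assumption;
# "0" is included because of the contrast ratio story but is valid for many attributes.
DEFAULT_PLACEHOLDERS = frozenset({"n/a", "na", "not applicable", ".", "-", "0"})


def is_placeholder(value, placeholders=DEFAULT_PLACEHOLDERS):
    """True if the value, ignoring case and surrounding spaces, is known filler."""
    return value.strip().lower() in placeholders


def placeholder_rate(values, placeholders=DEFAULT_PLACEHOLDERS):
    """Share of non-blank values for one attribute that are filler."""
    filled = [v for v in values if v.strip()]
    if not filled:
        return 0.0
    flagged = sum(1 for v in filled if is_placeholder(v, placeholders))
    return flagged / len(filled)


# Hypothetical values collected for a "Contrast Ratio" attribute.
contrast_values = ["1000:1", "N/A", ".", "3000:1", "not applicable"]
filler_share = placeholder_rate(contrast_values)
```

An attribute with a high filler share is a candidate for the questions in this post: is it collectible, and is it valuable enough to stay required?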
In Summary…
There is more to designing your attribute schema than throwing attributes onto an Excel template and handing that template to users to complete. It takes good product taxonomy principles, proper hierarchy design, and taxonomy governance to ensure change is appropriate. But not all businesses have the budgets to afford this overhead.
So remember the value rules for attribution: If the attribute doesn’t allow you to administer a record, improve the data collection process, or help a customer find and/or buy a product, ask yourself why you have that attribute in your schema. Look at the data you’ve already collected and ask if it’s consistent, complete, and accurate. If you can’t find value in that attribute and the data quality is poor, consider deleting that attribute.