
Benchmarking’s Bold New Future


Benchmarking has always been the most challenging aspect of performing a transfer pricing analysis. Conducting a comparable search requires the practitioner to make personal choices: both quantitative and qualitative criteria are used to include or reject potential comparables. The Organisation for Economic Co-operation and Development (OECD) has recognized that these choices influence the outcome of transfer pricing analyses.

The problem is that the selection of sufficiently comparable companies requires economists to make subjective judgments regarding functional comparability.

As such, over the years, many policy-making bodies, including the OECD, have published papers on how to improve benchmarking, with the ultimate goal of reducing subjectivity and relieving the burden, financial and otherwise, of establishing a reliable arm’s-length range.

One such policy-making body, the Platform for Collaboration on Tax (PCT), a joint initiative of the International Monetary Fund (IMF), OECD, United Nations (UN), and World Bank Group (WBG), has developed a toolkit to help developing countries understand the practical application of comparable data for transfer pricing. The European Commission also addressed the challenges of benchmarking in a paper presenting the perspectives of EU Joint Transfer Pricing Forum (JTPF) members.

Transfer pricing practitioners have traditionally maintained that third-party companies that are perfectly comparable across all comparability dimensions do not exist, and thus, subjectivity is a necessary evil in the benchmarking process.

However, we challenge that notion.

Thanks to advanced technology and the evolution of artificial intelligence, we can conduct comparable searches based entirely on facts and circumstances, devoid of judgment calls, and increase the reliability of the arm’s-length range.

Traditional Approaches

The traditional approach to comparable searches has been to identify comparable companies through a systematic approach, which limits subjectivity.

According to the PCT and generally accepted transfer pricing practices, the minimum requirement for the application of the arm’s-length principle rests on two factors:

  1. A third-party comparable company must have available financial data.
  2. A controlled entity is not useful in evaluating arm’s-length profitability.

The following table shows the number of countries that have at least some pool of financial data in select Standard & Poor’s and Dun & Bradstreet databases¹:

# of Countries Available    # of Independent Records (Revenue/Net Margin Information)
11                          100–1,000
21                          1,000–10,000
13                          10,000–100,000
5                           >= 100,000

Using a conservative approach, 50 countries have more than 100 companies that are potentially useful in a transfer pricing analysis.

If a jurisdiction has at least 100 companies that are sufficiently independent with financial data, the next step would be to stratify those companies across functions for transfer pricing purposes.

To do so, practitioners have historically relied on industrial classification codes.

1 Standard & Poor’s for public company data, and Dun & Bradstreet for private company data.

Problems with Classification Systems

Classification systems, like the Occupational Safety and Health Administration (OSHA) Standard Industrial Classification (SIC) system, or the European equivalent, the Nomenclature des Activités Économiques dans la Communauté Européenne (NACE), were not designed for transfer pricing purposes and present limitations in preparing comparable searches.

In fact, JTPF members state in paper JTPF/009/2016/EN:

Industry codes (like SIC or NACE) do not often allow for a reliable selection of companies in the same industry. It is recommendable to concentrate more on a comprehensive selection and combination of precise keywords rather than a narrow selection of industry codes when defining search strategies.

The JTPF members also note that industry classification codes themselves are subjective and inconsistent, as functionally comparable companies may not be classified under the same industry code across different countries. Companies can choose their own industry-classification codes, and given evolving business models, many companies do not fit nicely into one specific classification.
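The JTPF recommendation quoted above, to favor combinations of precise keywords over a narrow selection of industry codes, can be illustrated with a minimal sketch. Everything in it is hypothetical: the company names, the industry codes, the business descriptions, and the `matches_keywords` helper are invented for illustration, and plain substring matching is a deliberate simplification of what a real search strategy would need.

```python
from dataclasses import dataclass


@dataclass
class Company:
    name: str
    industry_code: str  # hypothetical SIC/NACE-style code
    description: str    # free-text business description


def matches_keywords(description, required, excluded=()):
    """True if every required keyword appears and no excluded keyword does."""
    text = description.lower()
    has_required = all(word in text for word in required)
    has_excluded = any(word in text for word in excluded)
    return has_required and not has_excluded


# Hypothetical records: two distributors filed under different codes,
# and one manufacturer that shares a code with the first distributor.
companies = [
    Company("Alpha Trading", "5065",
            "Wholesale distribution of electronic components"),
    Company("Beta Holdings", "6719",
            "Holding company; subsidiaries engage in distribution of electronic parts"),
    Company("Gamma Labs", "5065",
            "Design and manufacture of electronic sensors"),
]

# Screen 1: a narrow industry-code filter.
by_code = [c.name for c in companies if c.industry_code == "5065"]

# Screen 2: a keyword combination applied to the business descriptions.
by_keyword = [
    c.name
    for c in companies
    if matches_keywords(
        c.description,
        required=("distribution", "electronic"),
        excluded=("manufacture",),
    )
]

print("By code:   ", by_code)
print("By keyword:", by_keyword)
```

In this toy data set the code filter keeps the manufacturer and misses the distributor filed under a holding-company code, while the keyword screen does the reverse, which is the inconsistency the JTPF members describe.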

Practically speaking, classification codes are necessary to reduce the number of companies that require more qualitative evaluation for functional comparability.

However, if the classification system aligned functions across an industry, or was even representative of today’s business models, would transfer pricing practitioners have to be as discerning about the selection of companies across the numerous comparability requirements, as outlined by the OECD?

What is Comparable?

The evaluation of companies for benchmarking purposes, per OECD guidance, is highly dependent on the quality of the information available.

Unfortunately, even a change in the stratification of data does not change the level of detail available. Thus, practitioners and policy-making bodies, such as the JTPF, challenge the OECD on the feasibility of meeting comparability standards, claiming they are “illusory.”

Practically speaking, the minimum criteria for identifying good benchmarks include:

  • Availability of financial data
  • Independence
  • Overcoming barriers to entry²
  • Functional comparability

Of the four minimum criteria, the evaluation of functional comparability is the most subjective filter. Transfer pricing searches have historically involved reading the business descriptions of each potentially comparable company for functional comparability.
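The first three criteria can be applied mechanically, which is what distinguishes them from functional comparability. The sketch below is illustrative only: the `Candidate` record, its field names, and the sample companies are assumptions, and the revenue screen uses the "greater than € 2 million" figure attributed to the JTPF in this paper as a stand-in for the barriers-to-entry test.

```python
from dataclasses import dataclass
from typing import Optional

# Revenue proxy for "overcoming barriers to entry" (JTPF figure cited in
# this paper; the PCT screen of at least EUR 5 million could be used instead).
REVENUE_THRESHOLD_EUR = 2_000_000


@dataclass
class Candidate:
    name: str
    revenue_eur: Optional[float]  # None when no financial data is available
    is_independent: bool          # False for controlled entities


def passes_objective_screens(candidate, threshold=REVENUE_THRESHOLD_EUR):
    """Apply the three non-judgmental criteria; functional review comes later."""
    if candidate.revenue_eur is None:   # availability of financial data
        return False
    if not candidate.is_independent:    # independence
        return False
    return candidate.revenue_eur > threshold  # barriers to entry


# Hypothetical candidates, one failing each screen and one passing all three.
candidates = [
    Candidate("A", 3_500_000, True),
    Candidate("B", None, True),
    Candidate("C", 10_000_000, False),
    Candidate("D", 1_200_000, True),
]

survivors = [c.name for c in candidates if passes_objective_screens(c)]
print(survivors)
```

Only the companies that survive these screens would move on to the functional comparability review, which is where the judgment calls discussed in this paper arise.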

During my first few years in transfer pricing, I remember hearing Dr. Ednaldo Silva, CEO and founder of RoyaltyStat, speak about empirical evidence that supports consistent arm’s-length ranges for distributors across different industries.

At the time, many professionals disregarded this perspective because it challenged the value that transfer pricing professionals brought to the table—the ability to identify functionally good comparables in the same industry.

But soon accounting firms began to relegate the mundane task of reviewing comparability to junior analysts or to establish offices in low-cost jurisdictions, while other companies tried to create “off the shelf” benchmarks because industrial classification codes remained unreliable.

2 For purposes of identifying companies that have overcome the barriers to entry, we can typically apply a sales/revenue threshold. The JTPF members recommend companies with operating revenue of greater than € 2 million (JTPF/009/2016/EN, page 24) while the PCT applies a turnover screen of at least € 5 million (PCT, page 103).

Combined with the inherent subjectivity introduced in the search process, comparables have become a major subject of debate between taxpayers and tax authorities.

Fiona and the Future

Historically, time, money, and manpower were limiting factors in identifying comparables.

Now, finding satisfactory third-party comparable companies with objectivity is not only possible—it’s already happening. Artificial intelligence is changing how we approach benchmarking.

AI-powered Fiona, for instance, evaluates companies through a transfer pricing lens and identifies the most reliable, functionally comparable companies, regardless of the classification-code construct.

Comparability searches of the past required so many resources that we settled for the identification of five-to-ten broadly comparable companies.

Fiona can identify more comparable companies, reduce subjectivity, and create more meaningful statistical samples.

In fact, the PCT paper states:

While typical screening processes rely on factors such as industry classification codes as a practical means of refining a search, the extent to which such a code or other screening criterion is aligned with the economically relevant characteristics of the accurately delineated transaction needs to be considered.

Benefits of Larger Sample Sizes

An increased number of comparable companies, without the subjective constraints of the classification-code construct, helps to create more statistically significant ranges that are less sensitive to individual company challenges.

It also eliminates some of the errors that are introduced through subjective screening on multiple levels of the analysis.

The use of an interquartile range is not meaningful without an appropriate sample size. Our use of the interquartile range as a statistical approach is best supported by an example presented in the PCT toolkit.

Our ability to identify more comparables based on functional comparability through a more objective process is better suited to the application of statistical approaches identified by the various policy-making bodies and economists.
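To illustrate why sample size matters for the interquartile range, the sketch below measures how far the quartiles move when any single company is removed from a set. The margins are invented, evenly spaced figures rather than real benchmark data, the helper names are hypothetical, and the "inclusive" quartile convention of Python's `statistics.quantiles` is an assumption, since neither this paper nor the excerpts quoted here specify one.

```python
import statistics


def interquartile_range(margins):
    """Return (Q1, Q3) using the 'inclusive' quartile convention."""
    q1, _median, q3 = statistics.quantiles(margins, n=4, method="inclusive")
    return q1, q3


def max_leave_one_out_shift(margins):
    """Largest move in Q1 or Q3 caused by dropping any one company."""
    q1, q3 = interquartile_range(margins)
    worst = 0.0
    for i in range(len(margins)):
        reduced = margins[:i] + margins[i + 1:]
        r1, r3 = interquartile_range(reduced)
        worst = max(worst, abs(r1 - q1), abs(r3 - q3))
    return worst


# Hypothetical net margins (in percent) spanning the same 1%-9% band.
small_sample = [1.0, 3.0, 5.0, 7.0, 9.0]            # 5 comparables
large_sample = [1.0 + 0.4 * i for i in range(21)]   # 21 comparables

shift_small = max_leave_one_out_shift(small_sample)
shift_large = max_leave_one_out_shift(large_sample)

print("IQR, small sample:", interquartile_range(small_sample))
print("Max shift, small sample:", round(shift_small, 2))
print("Max shift, large sample:", round(shift_large, 2))
```

With these illustrative figures, removing one company moves a quartile by roughly 1.5 percentage points in the five-company set but only by roughly 0.3 in the 21-company set, so accepting or rejecting any single comparable has far less effect on the larger range.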

Interestingly, JTPF members suggested in their report that the final outcome of a qualitatively reviewed benchmark should be in line with a “rough data dump within the database” and “big deviations” may lead to “doubts as to the reliability of the benchmark.”

The report goes on to state that a high number of rejected comparables should be evaluated more closely, and it refers to the process of manual screening as “cherry picking.”

With Fiona, taxpayers can avoid the accusation that they hand-selected comparables because Fiona has identified functionally comparable companies objectively, going well beyond those antiquated and subjective classification systems.

Overall Benefits of AI

If tax authorities are going to identify better comparables, then the burden of proof is on them. Fiona can produce a much larger set, which should (with the exception of database limitations) produce all the companies that a tax authority would determine as functionally comparable.

In fact, CrossBorder Solutions anticipates that comparable searches that continue to be prepared using traditional approaches will become easy to challenge under audit, as more third-party comparables are identified outside of typical classification codes.

Fiona facilitates a more robust way of reviewing millions of potentially comparable companies that meet the minimum comparability requirements.