
Optimising Compliance
The role of analytic techniques

[Cover figure: a Rattle decision tree built on audit-csv.txt (target $Adjusted), reproduced in the Descriptive analytics section later in the paper.]

Abstract

The use of quantitative analytical techniques has an important and growing role in the optimisation of revenue authority client compliance. This paper covers some of the issues associated with the application of analytic techniques in a revenue authority business setting. The views expressed in this paper are those of the author and do not necessarily reflect the considered views of my colleagues at the Australian Taxation Office on any matter.

Stuart Hamilton
Assistant Commissioner, Corporate Intelligence & Risk
Australian Taxation Office
2 Constitution Avenue, Canberra ACT 2600

Date: Thursday, 3 November 2006

Table of Contents

Title & abstract
Overview
Background
  The ATO intent and business model
  Analytics and the personalisation of interactions
Scene setting
  Non-compliance
  Strategic Risk
  A consistent measuring framework for Revenue Risk
Optimisation
  Treatments available (Enhance treatments)
  Assignment of clients to those treatments (Enhance candidate selection process)
  Case mix (Enhance case mix)
Other matters
  Prioritising analytic work
  Measuring strikes
  The effect of varying strike rates
  Traditional versus analytic driven case selection – an example
  Optimising capability
Conclusions
Annexes
  Scoring using ∆Tax
  Some additional Rattle analysis output
  Spreadsheets used in this paper
  Summary of some analytic methodologies that might be used in optimisation


Optimising Compliance – the role of analytic techniques

Overview

The use of quantitative analytical techniques has an important and growing role in the optimisation of revenue authority client compliance. This paper covers some of the issues associated with the use of analytic techniques in a revenue authority business setting.

Background – The ATO intent and business model

The ATO's business intent is to optimise voluntary compliance and make payments under the law in a way that builds community confidence.[1] It is similar to the underlying intent or mission of many, if not most, other revenue authorities. The ATO Business Model is premised on clients self assessing their tax obligations, with the ATO providing education and assistance, making it easier to comply, and verifying that compliance is occurring using appropriate risk management approaches.

[Figure: The ATO business model – Education & Advice, the Change Program and the Compliance Program, underpinned by the Community Relationship Model, Compliance Model & Taxpayers' Charter, and by the Brand Navigator personas: Trusted authority, Professional advisor, Fair administrator, Firm enforcer.]

The ATO seeks to optimise voluntary compliance by:

o Providing education and advice, including rulings, alerts and self help materials, to help clients and their advisers understand the ATO's view of how the law might apply to their facts and circumstances.

o Making a compliant client's tax experience easier, cheaper and more personalised via the ATO Change Program. (Non-compliant clients may find it becoming harder and more expensive to not comply!)

o Addressing non compliance in an appropriate manner. The Compliance Program sets out the ATO view of compliance risks posed by some clients and how the ATO plans to address them in the coming year.

Addressing non-compliance in an appropriate manner requires use of the Taxpayers' Charter principles regarding taxpayer rights and obligations; the Compliance Model view of treatment selection and escalation; and the Brand Navigator view of the appropriate ATO persona to present.

These views are brought together in the ATO Community Relationship Model:[2]

[1] See http://www.ato.gov.au/content/downloads/ARL_77317_Strategic_Statement_booklet.pdf
[2] See http://www.ato.gov.au/docs/CommunityRelationshipModel.doc


Effectively, in a manner consistent with the intent of the Taxpayers' Charter, the ATO aims to present an appropriate ATO persona to deliver, through the appropriate communication channel, a relatively tailored compliance strategy that optimises the long term compliance of the client.

[Figure: The Compliance Model mapped to Brand Personas. Attitudes to compliance (from "Willing to do the right thing", through "Try to, but don't always succeed" and "Don't want to comply", up to "Have decided not to comply") are matched to escalating compliance strategies ("Make it easy", "Assist to comply", "Deter by detection", "Use the full force of the law") and to the corresponding brand personas (Trusted authority, Professional advisor, Fair Administrator, Firm enforcer), with the higher level strategies creating pressure down the pyramid.]

The "Easier, Cheaper and More Personalised" change program sets out eight key principles to guide the design and development of ATO products and services. Two of the guiding principles link to the services that are enabled or facilitated by analytic methods. These two principles[3] are:

o Guiding Principle 04: You will receive notices and forms that make sense in your terms and that reflect your personal dealings with the revenue system.

o Guiding Principle 08: You will experience compliance action which takes into account your compliance behaviour, personal circumstances and level of risk in the system.

[3] See http://www.ato.gov.au/content/downloads/Making_it_easier_to_comply_2005_06.pdf


Both of these principles posit that the ATO's interactions with the client will be more personalised – that is, the notices, forms and compliance action[4] will better reflect the client's circumstances, behaviours and risk profiles relative to other clients.

Background – Analytics and the personalisation of interactions

To provide more personalised interactions, the ATO needs to be able to identify within its client base the relevant differences between clients – that is, those differences that matter. This is the essence of analytics: being able to examine large data holdings to identify those attributes that correlate to particular client circumstances, behaviours and risk profiles. Those attributes can then be used to enable the ATO to better tailor its client interactions – to make the interaction more relevant to the client, more personalised and more likely to succeed. This examination of client attributes to identify those that correlate to particular client circumstances, behaviours and risk profiles is an area of analytics known as 'data mining' or knowledge discovery, and it is this approach that heralds an improved way of optimising client compliance.

Before delving into these analytic approaches some scene setting is needed to establish a consistent base of terminology and understanding for this paper.

Scene setting – Optimisation

Many, if not most, revenue authorities aspire to "optimise client compliance" with taxation laws, but what does "optimise" really mean and what role might advanced analytic techniques play?

Economic theory (and common sense!) tells us that there is an optimum point in a revenue system at which the net revenue of the system would be maximised (Os) for given tax rates and rules. Beyond that point the marginal cost exceeds the marginal revenue – it costs more than a dollar to get a dollar in. In most economies the revenue authority would operate significantly below this point (O1), reflecting the lower budgetary needs of Governments, administration resource constraints and accepted community attitudes regarding the nature and level of intrusion of the revenue authority into the economy.

[Figure: Total revenue, total cost and net revenue plotted against compliance effort. Net revenue peaks at the theoretical system optimum (Os), which sits within a theoretical "full compliance" bandwidth; in practice the revenue authority operates at a resource constrained optimum (O1) well below that point.]

In these circumstances "optimisation" takes on a different practical meaning.
The challenge for the revenue authority is to optimise long term net revenue outcomes within its resource and other operating constraints – ie to maximise long term voluntary compliance given relatively fixed resources.

[4] Compliance action in accordance with the ATO Compliance Model and the Taxpayers' Charter. "The compliance model directs that we better understand why people are not complying and that we develop appropriate and proportionate responses. An underlying objective is to develop responses that maximise the proportion of the community who are both able to, and choose to, comply." Depending on the reasons for non-compliance, responses can be aimed at: enabling compliance through education, assistance or making it simpler to comply, or enforcing compliance through administrative and prosecution action.


This paper focuses on this view of 'resource constrained optimisation' and how it might be more objectively achieved by revenue authorities. The analysis is largely restricted to quantifiable aspects of revenue risk; more subjective aspects such as reputation risk are not overtly factored in. In order to optimise long term voluntary compliance it is suggested that a number of aspects need to occur:

o The number of clients requiring relatively expensive remedial compliance action should be minimised (ie clients should know how to comply [education], be able to comply [the interaction of the client and the revenue authority's systems needs to 'fit'] and be ready to comply [attitudinal]).

o The right range of remedial compliance treatments should be available. Any remedial compliance action should maximise the long term compliance gain within the client base (ie the treatment should be the most appropriate to engender voluntary compliance overall – a leverage compliance model viewpoint).

o The right clients should be selected for the appropriate remedial compliance action (ie the strike rate for a particular remedial compliance treatment should be maximised).

o The right mix of discretionary compliance work needs to be achieved overall so that long term revenues are optimised within the resource and capability constraints faced by the organisation.

Revenue authorities generally have a range of administrative treatments – education, assistance, review and enforcement products – that they can use to address non-compliance, in addition to generally longer term system and legislative/policy changes. At present, optimisation in tax administration is generally based on the expert views of experienced senior officers. Equally valid but different views might also be held, and hence the outcome is one of subjective optimisation. Colloquially it might be put as an 'informed gut feel' of the right balance – a judgment call. It is suggested in this paper that these judgment based optimisation approaches may be better informed, and often enhanced, by the use of objective business decision support approaches that assist in the determination of the 'right' products, the 'right mix' of products and the 'right' clients to apply those treatments to. However these objective approaches do require reasonably robust knowledge of the costs and benefits of various courses of action and this can be a significant problem. A greater level of consistency of approach and understanding is also needed so that the information used in business decision support is valid and verifiable.[5]
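To make the resource constrained view a little more concrete, the following is a minimal sketch (in Python, with invented figures – an illustration of the idea only, not ATO practice) of selecting a case mix that maximises expected revenue within a fixed budget of audit hours, using a simple best-return-per-hour heuristic:

```python
# Sketch: resource constrained optimisation of a case mix (illustrative figures only).
# Greedily pick candidate cases with the best expected revenue per hour until the
# budget of available audit hours is exhausted.

candidates = [
    {"case": "A", "expected_revenue": 12000, "hours": 40},
    {"case": "B", "expected_revenue": 3000,  "hours": 5},
    {"case": "C", "expected_revenue": 8000,  "hours": 10},
    {"case": "D", "expected_revenue": 1500,  "hours": 4},
]

def select_cases(candidates, budget_hours):
    """Return the cases selected under the hours budget and their total expected revenue."""
    ranked = sorted(candidates, key=lambda c: c["expected_revenue"] / c["hours"], reverse=True)
    selected, used = [], 0
    for c in ranked:
        if used + c["hours"] <= budget_hours:
            selected.append(c["case"])
            used += c["hours"]
    total = sum(c["expected_revenue"] for c in candidates if c["case"] in selected)
    return selected, total

print(select_cases(candidates, budget_hours=20))   # (['C', 'B', 'D'], 12500)
```

In practice the "expected revenue" figures are themselves model outputs with confidence intervals, which is where the analytic techniques discussed below come in.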

Scene setting – Non-compliance

To select the right clients for the right treatment we need to have a consistent view of what non-compliance is, so that we can form an objective view on the relativities of various aspects of client non-compliance.

While there are many aspects to compliance with revenue laws and regulations (see the ATO Compliance Program), client obligations can generally be thought of in the following broad manner:[6]

o Registering in the system (either with the revenue authority or with some other body)

o Lodging or filing the appropriate forms on time

o Providing accurate information on those forms

o Making any transfers or payments due on time

Most revenue systems also require a client to maintain records of appropriate information for some set period, ie:

o Keeping records that allow verification of the information used to satisfy the above obligations.

[5] If such information is not available then simulation modelling can still assist in determining key directions and sensitivity aspects.
[6] See page 7 of the OECD 2004 document: Compliance Risk Management @ http://www.oecd.org/dataoecd/44/19/33818656.pdf


Scene setting – Strategic Risk

At a macro compliance level we could consider revenue risk to be the total tax not collected due to non-compliance with the above broad obligations. The US IRS, for example, has produced a 'Tax Gap' analysis based on this revenue risk framework. (Note that in the IRS analysis non-registration is included in non-filing.) The IRS Tax Compliance Measurement Program (now known as the National Research Program – NRP) is primarily based on a large scale periodic stratified random audit approach and indicates that:

o ~10% of non-compliance is related to non registration and non lodgement;

o ~80% to under reporting of taxable income;[7] and

o ~10% to non payment of debts due.

The overall non-compliance rate in the USA is ~15% of the theoretical tax believed due. Simplistically, if the voluntary compliance cost and response were equal between these types of non-compliance then the compliance resource allocation should broadly match the above splits. This simplistic view does not take into account that resource intensity and compliance responsiveness will in practice differ significantly depending on the nature, causes and extent of the non-compliance and the available treatments for it. (More on this later.) Such a macro or high level view is of limited practical use in understanding and addressing non-compliance.

[Figure: IRS tax gap estimates. Source: http://www.irs.gov/pub/irs-news/tax_gap_figures.pdf]

Moreover, while superficially compliance may seem a strict matter of fact – yes or no – in practice it is often far more blurred or 'grey', a question of interpretation and judgement, than may appear to a layman.[8] Uncertainties in definitions, interpretation and measurement compound to render views on what full compliance is into a relatively broad bandwidth with wide confidence levels. Movements in total compliance are very difficult to ascertain with a high degree of confidence and are necessarily dated with this periodic sampling approach.

[7] Note – the figures for underreporting are more subjective as they are inflated (by a factor between 2 and 3) to take into account income not detected by audit processes. See for example the US Treasury Inspector General for Tax Administration's April 2006 report @ http://www.ustreas.gov/tigta/auditreports/2006reports/200650077fr.pdf#search=%22US%20IRS%20Tax%20Gap%20Estimate%20Inspector%20General%22
[8] See the OECD 1999 document: Compliance Measurement, at page 4 @ http://www.oecd.org/dataoecd/36/1/1908448.pdf


In these circumstances concepts such as tax gap become significantly less precise and useful in objectively guiding compliance activities – the what and who to review. The OECD practice note on performance measurement in tax administration sums up the tax gap question at paragraph 224: "To sum up, the general position on measuring the tax gap is that it is difficult if not downright impossible and even if it were possible to get a reliable total figure it would not tell us much of practical value in the struggle against noncompliance."

That does not mean that a random audit program such as that used by the US IRS NRP does not have its place in the optimising compliance tool kit. It just needs to be used appropriately – when knowledge of the risks of a client group is relatively low, so that the low strike rate, and hence high rate of intrusion into compliant clients' affairs, of a significant random case selection approach can be justified overall. Broadly speaking, when optimising compliance I would suggest that any large scale random audit program be restricted to those situations where the true informational value and deterrence effects of the random process outweigh the significant social and opportunity costs it imposes over targeted detection based approaches – ie the few areas where the organisation is flying blind, eg where new legislation has been introduced and the discriminate features of non-compliance are largely unknown and can't be reasonably estimated.

So generally we may conclude that:

o Random / coverage based approaches are useful for threat discovery where we cannot effectively target clients on a known risk. They can also be used to update views on risk relativities.

o Risk based approaches are appropriate where we can target clients by way of known risks. (Note that we can still discover new risks to the extent they occur, and are detected, in the clients reviewed.)

This discovery / detection continuum can be represented in the following model:

[Figure: Discovery v Detection – a risk knowledge continuum. As knowledge of risk in the client population moves from low to high, the appropriate approach shifts from discovery of new threats/risks (stratified random audits, descriptive data mining/clustering approaches) to detection of known risks/issues (risk targeted audits, predictive data mining/scoring approaches).]

If you have a robust and effective measure of client risks and how to detect them then discovery processes can be kept relatively small compared with the situation where you have little or no knowledge of client risks. Discovery processes often have a low strike rate – the value is generally in the insight into better detection approaches. Note: the volume of cases needed to discover relative differences in risk (and so tune selection approaches) is significantly lower than the volume of cases needed for a statistically accurate estimate of the overall tax gap.
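As a rough back-of-envelope illustration of that note (my own figures and standard normal-approximation formulas, not calculations from the paper), compare the sample needed to estimate an overall non-compliance proportion to within ±1% with the sample needed per group simply to detect a 10 percentage point difference in strike rate between two selection approaches:

```python
# Sketch: why tuning selection needs far fewer cases than estimating a tax gap.
# Standard normal-approximation sample size formulas; all figures are illustrative.
from scipy.stats import norm

z95 = norm.ppf(0.975)          # two-sided 95% confidence
z80 = norm.ppf(0.80)           # 80% power

# Estimating an overall non-compliance proportion (say ~15%) to within +/-1%.
p, margin = 0.15, 0.01
n_gap = (z95 ** 2) * p * (1 - p) / margin ** 2

# Detecting a 10 point difference in strike rates (say 20% vs 30%) between two
# selection approaches, per group, at 95% confidence and 80% power.
p1, p2 = 0.20, 0.30
n_detect = ((z95 + z80) ** 2) * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2

print(round(n_gap), round(n_detect))   # roughly 4900 vs roughly 290 per group
```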

Most leading OECD revenue authorities have a reasonably robust knowledge of their client base, and the broad strategic risks do not generally change much year to year. Overall, most of their clients pay most of their tax most of the time. The ATO's compliance program (http://www.ato.gov.au/content/downloads/ARL_77362_n7769-82006_w.pdf) details the ATO view of the compliance risks and issues facing the organisation, the treatments being applied and the results. So if we don't use a tax gap analysis to inform the relativities of strategic and operational intelligence (the what and who to treat), is there a consistent, objective framework that can be applied across products and obligations?


Scene setting – A consistent measuring framework for Revenue Risk

To objectively optimise compliance across the obligations we need a relatively common approach to measuring non-compliance – otherwise our prioritisation between types of non-compliance will necessarily be subjective. If one area prioritises cases on one basis and another area uses a very different measuring stick, then the overall view of the relativities of revenue risk will still be based on personal, albeit informed, judgements of the matter. Without an objective mechanism our legal 'choice of remedy' may be more open to criticism and question over concerns of bias, subjectivity and inconsistency. (Why did we select case X over case Y, or risk A over risk B?)

Broadly speaking, the following conjectures might be stated regarding the requirements of a revenue risk measurement approach:

o For the client obligations of registration, on time lodgment, accurate reporting and correct accounting/payment, client revenue risk should be consistently viewed where this is practical.

o Client revenue risk ranking should be done as objectively as possible, taking into account aspects such as the revenue at risk, the severity of non-compliance and our confidence level in the revenue at risk.

o Client revenue risk ranking should be flexible enough to cater for aspects such as losses, recidivism, schemes and associated client linkages.

o Client revenue risk ranking should be highly scalable and stable – from lists of a few clients to potentially scoring millions of clients.

For the client obligations of registration, lodgment, accurate reporting and correct accounting/payment, I suggest that revenue risk can generally be distilled into the following four features for the purposes of comparable revenue risk ranking across and within obligations and products:

o ∆Tax (Delta tax) – The change in primary tax associated with the non-compliance. [ie Identifies those who may have the most tax wrong. An absolute amount. "Client A may have underpaid $5,500 in tax in year y."]

o ∆Tax/(∆Tax + Tax) (Severity) – The relative severity of the non-compliance as a percentage of tax paid. [ie Identifies those who may have most of their tax wrong. A relative value. "Client A may have underpaid 15% of their tax in year y."]

o Cf(∆Tax) (Confidence) – The confidence interval associated with our estimate of ∆Tax. [ie Identifies how confident we are of the estimate of ∆Tax. "We are 90% confident that Client A underpaid $5,500 in tax in year y, +/- $550."]

o Pf(∆Tax) (Proportion collectable) – The proportion of ∆Tax estimated to be collectable. [A function of a client's propensity to pay and their capacity to pay. "We estimate that 80% of the $5,500 estimated to be underpaid by Client A will be collectable, +/- $1,000."]
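A minimal sketch of how these four features might be carried for a single client is shown below. The $5,500, 15%, ±$550 and 80% figures come from the worked examples above; the tax-paid figure is invented so that the severity comes out at roughly 15%.

```python
# Sketch: the four revenue risk features for one client (example figures from the text).
delta_tax = 5500.0                                       # estimated change in primary tax ($)
tax_paid = 31167.0                                       # illustrative tax otherwise paid
severity = delta_tax / (delta_tax + tax_paid)            # proportion of total tax at risk (~15%)
confidence_interval = (delta_tax - 550, delta_tax + 550) # 90% confidence band on delta_tax
proportion_collectable = 0.80                            # share of delta_tax expected to be collected

expected_collectable = delta_tax * proportion_collectable
print(f"severity={severity:.1%}, expected collectable=${expected_collectable:,.0f}")
# severity=15.0%, expected collectable=$4,400
```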

Let us look at how this revenue risk concept might work for the key client obligations previously identified:

o Registration – For registration non-compliance, ∆Tax is the estimate of the amount of net revenue that is predicted to be derived by achieving registration (and subsequent lodgement) for those clients not registered in the system. (Detection is generally via matching processes, comparing a list of names against registered clients to detect those operating outside the system.)

o Lodgment – For non lodgers, ∆Tax is the estimate of the amount of net revenue (ie taking into account withholding and instalments: PAYGW & PAYGI etc) that is predicted to be derived by achieving lodgment from a treatment for those clients who otherwise would not lodge. For late lodgers – those clients who would lodge late – ∆Tax is really the time value of money (PV) brought forward by achieving earlier lodgment following a treatment. Most lodgment clients by number would be late lodgers rather than non lodgers. The two classes will overlap and an informed decision is needed for the transition from one class to the other, eg after 3 months a late lodger 'transitions' into a non lodger.

o Reporting – ∆Tax is the estimated primary tax associated with the incorrect reporting of a value on a lodged form. Most underpayment of tax is due to incorrect reporting. Unlike the more evident registration, lodgment and payment compliance aspects (eg Client A did not register/lodge/pay), incorrect reporting is generally harder to detect without relevant third party data. This is of concern as in mature tax systems most non-compliance is likely to be associated with understatements of taxable income or turnover (or the over claiming of deductions or credits).

o Correct accounting/Payment – For non payers, ∆Tax is the estimate of the amount of net revenue (ie taking into account PAYGW & PAYGI etc) that is predicted to be derived by achieving payment from a treatment for those clients who otherwise would not pay. For late payers – those clients who would pay late – ∆Tax is the time value of money (PV) brought forward by achieving earlier payment following a treatment. Most debt clients by number would be late payers rather than non payers. As with the late/non lodgers, the two classes will overlap and an informed decision point is needed for the transition from one class to the other.
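For the late lodger and late payer classes, ∆Tax is the time value of money brought forward. A minimal sketch of that present value calculation follows; the $10,000 liability, 7% discount rate and 90 days are invented purely for illustration.

```python
# Sketch: value of bringing a payment (or lodgment) forward, as a present value difference.
# Illustrative figures: $10,000 liability, 7% p.a. discount rate, 90 days earlier.

def pv(amount, annual_rate, days):
    """Present value of receiving `amount` in `days` time."""
    return amount / (1 + annual_rate) ** (days / 365.0)

liability, rate = 10_000.0, 0.07
delta_tax = pv(liability, rate, days=0) - pv(liability, rate, days=90)
print(f"value of 90 days earlier payment: ${delta_tax:,.2f}")   # about $165
```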

Having grouped our non-compliance using a common revenue measuring stick of ∆Tax, what are some of the salient aspects we would expect to see in the resultant distribution? For the expected ∆Tax distribution, where ∆Tax is the estimated primary tax involved in the non-compliance:

o Modal ~ 0, ie most clients comply with their tax obligations (in accordance with the compliance model view).

o Smaller negative tail, ie clients, and their advisors, have more of an incentive to detect, or not to make, errors in this direction.

o Longer positive tail following a truncated pareto or power distribution past $Y, ie most errors or omissions are relatively small while a few are very large. The fall-off is basically in accordance with the well known pareto distribution (a form of inverse power distribution).

o Our ability to predict ∆Tax from data is limited and has a confidence interval associated with it. Our confidence will be greatest where we have significant experience and falls off as we move away from this.

o Risk scores based on deviations from average ∆Tax would be relatively stable for large populations, ie the average overall tax underpaid would not vary significantly from year to year.

[Figure: Expected ∆Tax distribution – frequency against ∆Tax, modal near zero with a small negative tail and a long positive tail past $Y, overlaid with the confidence distribution in the ∆Tax estimate.]
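To give a feel for that shape, here is a toy simulation of such a distribution. All mixture weights and parameters are invented for illustration and do not reflect any actual ATO or IRS figures.

```python
# Sketch: a toy delta-tax distribution - modal near zero, small negative tail,
# long pareto-like positive tail. All parameters are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
u = rng.random(n)
delta_tax = np.where(
    u < 0.80, rng.normal(0, 50, n),                 # ~80% of clients: essentially compliant
    np.where(u < 0.85, -rng.pareto(3, n) * 500,     # ~5%: overpayments (negative tail)
             rng.pareto(2.5, n) * 1_000))           # ~15%: understatements (long positive tail)

print(f"median={np.median(delta_tax):.0f}, mean={delta_tax.mean():.0f}, "
      f"99th percentile={np.percentile(delta_tax, 99):.0f}")
```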


A similar distribution would follow for severity.

Theoretical ∆Tax/(Tax + ∆Tax) distribution, where ∆Tax/(Tax + ∆Tax) is the estimated primary tax involved in the non-compliance divided by the total net tax payable. This gives an indication of the relative 'severity' of the non-compliance taken as a proportion of the tax paid: a value near 0 implies the amount of the error, avoidance or evasion is relatively insignificant compared to the tax the client has paid; a value near 1 implies that the tax error, avoidance or evasion affected all or almost all of the tax otherwise payable.

Suggested features (as with the ∆Tax distribution):

o Modal ~ 0 – most clients comply with their tax obligations.

o Smaller though longer negative tail – clients, and their advisors, have an incentive not to make errors in this direction, but an overpayment may exceed the total tax that would have been due.

o Longer positive tail ~ truncated pareto or power distribution.

[Figure: Theoretical severity distribution – frequency against ∆Tax/(∆Tax+Tax) from 0 to 1, modal near zero with a long positive tail, overlaid with the confidence distribution in the ∆Tax/(∆Tax+Tax) estimate.]

These factors – the absolute amount, the relative amount, our confidence in the estimate and our view of collectability – are important in objectively prioritising our relative concept of risk so that we can optimise the selection of treatments and clients. A balance must be struck by the organisation so that case selection prioritisation is defendable and repeatable (so that our ranking of clients can be evaluated and improved). This approach of estimating client errors also raises the aspect of identifying and dealing appropriately with overpayment or credit situations – both in an absolute and a relative sense. To be a fair administrator the appropriate balance needs to be achieved in all things.


A ∆Tax estimate gives us a view as to who may have evaded or avoided the most tax in absolute terms – a critical factor for a revenue collection agency:

[Figure: ∆Tax – who avoided or evaded the most tax? Frequency against ∆Tax with the confidence distribution in the ∆Tax estimate; the long positive tail past $Y identifies those who may have the most tax wrong in absolute terms.]

While a ∆Tax/(∆Tax + Tax) estimate gives us a view as to who may have evaded or avoided most of their tax in relative terms – a critical factor for a revenue collection agency looking at serious non-compliance and aggressive tax planning.

[Figure: Severity ∆Tax/(∆Tax + Tax) – who avoided or evaded most of their tax? Frequency against ∆Tax/(∆Tax+Tax) from 0 to 1 with the confidence distribution in the severity estimate; values near 1 identify those who may have most of their tax wrong in relative terms.]


We can now use these concepts to bring together a view of the relativities of revenue risks and use this to prioritise our risks and candidate pool of clients for subsequent treatment:

o Ranking – Ranking scores could be produced for revenue, severity, confidence and collectability. A revenue score is produced by dividing the client ∆Tax by the average ∆Tax [ie ∆Tax/Ave∆Tax]. A severity score is produced by dividing the client severity [∆Tax/(∆Tax + Tax)] by the average severity. (If required, high, medium and low risk classes could be constructed in regard to the distribution – high risk say the highest 10%, medium risk the next 20% and low risk the remaining 70%.)

o Weighting revenue and severity – A revenue focused score might take a 90/10 weighting of the revenue and severity scores. A severity focused score might take a 10/90 weighting of revenue and severity. Weighting approaches ensure that one dimension doesn't completely dominate the case ranking process. It may be appropriate to allow weightings to be varied to best meet the focus of the case selection run.

o Confidence levels & Collectability – Cf(∆Tax) & Pf(∆Tax) could be used to discount ∆Tax if this was considered necessary.
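A minimal sketch of that ranking and weighting approach is given below. The client data are invented; the 90/10 weighting follows the revenue focused example above.

```python
# Sketch: composite risk ranking from revenue and severity scores (illustrative data only).
import pandas as pd

clients = pd.DataFrame({
    "client": ["A", "B", "C", "D"],
    "delta_tax": [5500.0, 800.0, 25000.0, 60.0],
    "tax_paid": [31000.0, 1200.0, 400000.0, 90.0],
})

clients["severity"] = clients["delta_tax"] / (clients["delta_tax"] + clients["tax_paid"])
clients["revenue_score"] = clients["delta_tax"] / clients["delta_tax"].mean()     # delta-tax / average
clients["severity_score"] = clients["severity"] / clients["severity"].mean()      # severity / average

# Revenue-focused composite: 90/10 weighting of the revenue and severity scores.
clients["composite"] = 0.9 * clients["revenue_score"] + 0.1 * clients["severity_score"]
print(clients.sort_values("composite", ascending=False)[["client", "composite"]])
```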

This approach to revenue risk scoring can provide an objective relative ranking of tax risks and cater for issues such as recidivism, losses and the promotion or association of non-compliance by others:

o Tax risks or issues – A relative ranking of a tax risk or issue could be derived from the relative ∑ of the ∆Tax and the severity of the clients affected. (Note that this does not extend into a quasi tax-gap analysis as ∆Tax is not based on representative samples. ∆Tax will generally be derived from known cases and is thus biased.)

o Recidivism & currency – Use a multiyear score based on the sum (∑) of the present value (PV) of ∆Tax over a standard time horizon (say three years – Y1, Y2 & Y3). Severity scores could be weighted by say (1xY3 + 0.7xY2 + 0.3xY1)/2 to effect a reduced impact of earlier years compared to later years. (A sketch of this scoring follows this list.)

o Losses – For multi year scoring, losses can be accommodated via the PV of the tax adjustment claw back over a standard time horizon (say three years).

o Agents – Agents could be looked at in regard to both the ∑ of the ∆Tax of their clients and also from a relative severity approach. Relatively high scoring agents could be assigned for an appropriate treatment.

o Scheme promoters & schemes – Scheme promoters and schemes could be looked at in regard to both the ∑ of the ∆Tax of the participants in the scheme and from a relative severity approach.

o Industry, occupation and location – Industry, occupation or location aspects could be looked at in regard to both the ∑ of the ∆Tax of the clients and also from a relative severity approach. Relatively high scoring industries, occupations or locations could be assigned for an appropriate treatment.

o Whole of Client – The values of estimated ∆Tax can be summed for a client provided they derive from mutually exclusive risks or they are counted on a first point of error basis. (ie Work related expenditure ∆Tax and rental property ∆Tax and omitted income ∆Tax can be summed, whereas WRE ∆Tax and over-claimed expenses ∆Tax cannot (unless over-claimed expenses is constructed to exclude WRE claims).)
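The multi-year recidivism scoring can be sketched as follows; the PV sum and the (1xY3 + 0.7xY2 + 0.3xY1)/2 weighting come from the list above, while the discount rate and client figures are assumptions for illustration.

```python
# Sketch: multi-year (recidivism-aware) scoring over a three year horizon Y1..Y3.
# Y3 is the most recent year. Discount rate and figures are illustrative only.

discount_rate = 0.07
delta_tax = {"Y1": 2000.0, "Y2": 0.0, "Y3": 4000.0}     # estimated delta-tax by year
severity  = {"Y1": 0.10,   "Y2": 0.00, "Y3": 0.20}      # delta-tax / (delta-tax + tax) by year

# Revenue component: sum of present values of delta-tax (years back: Y3=0, Y2=1, Y1=2).
years_back = {"Y3": 0, "Y2": 1, "Y1": 2}
pv_delta_tax = sum(v / (1 + discount_rate) ** years_back[y] for y, v in delta_tax.items())

# Severity component: later years weighted more heavily, per (1*Y3 + 0.7*Y2 + 0.3*Y1)/2.
weighted_severity = (1.0 * severity["Y3"] + 0.7 * severity["Y2"] + 0.3 * severity["Y1"]) / 2

print(f"multi-year PV of delta-tax: ${pv_delta_tax:,.0f}, weighted severity: {weighted_severity:.2f}")
```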

A whole of client / whole of product risk profile can then be constructed that places the client and product revenue risk in relative relationship to other clients and products. In the ATO these broad client obligation risks are now expanded upon in the case management system down to return form/schedule label items. The rationale for this is that incorrect reporting ultimately relates to aspects not correctly reported at a label item. Our tax return database consists of these label item values by form, by client, by reporting period and is generally a critical source of discriminate information used in case selection processes.


A client risk profile can be derived from revenue risk scores. Client Risk Scores can be supported at the transaction process level, case level, whole of product level and whole of client level. All scores can be organised in a logical hierarchy of risk with the 'whole of client score' at the highest level of aggregation and complexity*.

[Table: the client risk score matrix]
Obligation (columns): Register ("Propensity to Register Correctly"), Lodge ("Propensity to Lodge On-time"), Report/Advise ("Propensity for Correct Information"), Account ("Propensity to Pay On-time & In Full"), and Fully Compliant ("Propensity to Meet All Obligations" – weighted across the obligation scores).
Administrative Product (rows): Income Tax, GST, Excise, Super, other (FBT etc), and All Products (weighted across the product scores).
Each cell holds a Risk Score; the All Products / Fully Compliant cell is the Whole of Client Score.
Risk Attributes feeding each obligation score:
o Register – Registration History, Proof of Identity
o Lodge – Lodgment History, Timeliness, Aging
o Report/Advise – Assessment History, Label Analysis, Ratio Analysis, Refunds/Liabilities
o Account – Payment History, Debt Level, Timeliness/Aging

*NOTE: Client Scores can be further aggregated to support Industry, Occupation and Product Risk Scores for the whole client population. This ability is critical to realising a more robust risk assessment process and linking corporate risk rating with case-based risk rating.
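A sketch of how the obligation-by-product scores in the matrix above might be rolled up into product, obligation and whole of client scores follows. The scores and weights are invented for illustration; the actual ATO weighting scheme is not described in this paper.

```python
# Sketch: rolling obligation-by-product risk scores up to a whole of client score.
# All scores and weights are illustrative only.
import pandas as pd

scores = pd.DataFrame(
    {"Register": [0.2, 0.1, 0.0], "Lodge": [0.6, 0.3, 0.1],
     "Report":   [0.8, 0.5, 0.2], "Account": [0.4, 0.7, 0.1]},
    index=["Income Tax", "GST", "Super"])

obligation_weights = pd.Series({"Register": 0.1, "Lodge": 0.2, "Report": 0.5, "Account": 0.2})
product_weights = pd.Series({"Income Tax": 0.6, "GST": 0.3, "Super": 0.1})

product_scores = scores.mul(obligation_weights).sum(axis=1)      # the "Fully Compliant" column
obligation_scores = scores.mul(product_weights, axis=0).sum()    # the "All Products" row
whole_of_client = float(product_scores.mul(product_weights).sum())

print(product_scores, obligation_scores, round(whole_of_client, 3), sep="\n")
```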

It should be noted that the summation of ∆Tax cannot be used to construct a tax gap view like that of the IRS:

Why ∑∆Tax cannot be used for Tax Gap estimates: the ∆Tax predictions are derived from the analysis of successful cases (strikes) and are not drawn from a representative sample of the population. At low ∆Tax predictions, the confidence interval would be relatively large compared to ∆Tax, ie the uncertainty in the tax revenue at risk would exceed the estimate of the revenue involved – hence it cannot be used to project overall compliance levels accurately.

Diagrammatically:

[Figure: ∆Tax distribution with the confidence distribution in the ∆Tax estimate overlaid – near zero the confidence band is wide relative to the estimate itself.]

However ∆Tax predictions could be used to assist in estimating budget revenue at risk, as they are based on past claw back.

It is suggested that a better view of revenue risk is the risk to budgeted revenue flows from non-compliance rather than total tax gap concepts. The sum (∑) of ∆Tax from successful cases (strikes) is the direct claw back from compliance activities included in the consolidated revenue received by Government and built into budget estimates.


Tax office revenue risks can be thought of as unanticipated movements in budgeted revenue from changes in underlying compliance levels, whereas forecast errors are a Treasury risk. Estimates of ∆Tax, and changes in ∆Tax, could assist in the annual risk analysis process that is used to guide mitigation activities.

[Figure: Revenue consequence – distinguishing compliance behaviour changes from economic changes beyond our influence or control. Movements of actual revenue against forecast revenue (more or less revenue) represent economic movements and are Treasury's risk, while movements of the actual ∑∆Tax claw back against the predicted claw back built into budgeted revenue represent compliance movements and are the ATO's risk.]

Before turning to the use of analytic techniques for optimisation we should note that it is crucial that the qualitative information from our intelligence gathering activities be combined appropriately with the quantitative intelligence produced via the use of analytics. Bringing both views together produces a better result than using just one.

Intelligence views:

o Descriptor: "Aggressive Clients" – Those at the 'pointy end' of the compliance model whose actions and/or influence are such as to require intensive and immediate action or monitoring. People who rort either the tax laws (avoidance) or our administration of them (SNC). Suppression of known promoters and crime families and the detection of new / emerging ones (needle in a haystack work). '1-1' relationship once known. Client Profiles / Issue Profiles – tailored treatment > cuts across all markets/products. Potential for some commonality of staffing & collection, excluding technical.

o Descriptor: "Key Clients" – Involves the gathering of intelligence for large public groups whose economic importance is such that the consequence of non compliance is very significant even if the likelihood is low. '1-1' relationship. Client Profiles – tailored treatment > LB&I top 200 client groups / GST ILEC top 200 / Top Excise payers / Large Super. Potential for common focus on key clients for all products; common staff excluding technical.

o Descriptor: "Complex Clients" – Involves gathering intelligence for reasonably large, complex, often private groups often controlled by HWIs. Assembling a complete picture can be difficult and involve international aspects (CFCs). Individual cases can range from high likelihood/high consequence (active monitoring) to low likelihood/medium consequence (periodic monitoring). '1-Few' relationship – Client Profiles / Issue Profiles – mainly tailored treatment > SB SME / LB&I bottom end (next 1300 client groups) / GST SME / HWI. Potential for common focus on high risk client groups.

o Descriptor: "Mass Clients" – Generally involves the analysis of large volumes of data and the use of quantitative techniques and data matching to identify issues affecting significant numbers of clients. Individual cases are, relatively speaking, "high likelihood/low consequence", amenable to large scale mitigation programs. Industry/Occupation/Region type approaches. '1-Many' relationship – Issue Profiles – large treatment programs targeting segment issues > PTax / SB Micro / GST Micro / Excise diesel credits. Potential for common analytic/data matching staff working with technical experts.


Optimisation – the art / science of making a system as effective as possible

"It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts." – Sherlock Holmes in 'A Scandal in Bohemia' (1891)

"Data! Data! Data!" he cried impatiently. "I can't make bricks without clay." – Sherlock Holmes in 'The Adventure of the Copper Beeches' (1892)

The primacy of data driven approaches in optimisation

Optimisation is (or should be) an objective, data driven approach – based on evaluation and analysis of the data. While ideally the scientific methodology should be used (control groups, double blind approaches etc), in a business environment this may not always be possible with the degree of rigour normally associated with good science. That said, the conjunction of computer power, statistical techniques and machine learning algorithms has now allowed the development and deployment of robust approaches that can withstand the relatively poor quality / second best data often associated with business systems.

Having set the scene and established a view of what non-compliance is and how we might consistently measure revenue risk, we now look at how we determine the right treatment for the right client.

[Figure: The compliance optimisation cycle – Identify strategic risks > Select risk to be treated > Define candidate population for risk > Select & rank candidate pool > Determine treatment options > Select & allocate candidates to treatments > Treat clients > Measure and analyse effectiveness, with feedback loops to Enhance treatments and Enhance candidate selection process.]

We will then examine approaches for case mix optimisation – how does the mix of case types and numbers impact upon the revenue and can that be optimised.


Optimisation – Treatments available (Enhance treatments)

It is a truism that if the only tool you have in your bag is a hammer then the solution to every problem starts to look like a nail. All you can optimise in such situations is how you hit the object. Systems where the only answer is a prosecution will tend to view the solution set to a compliance issue as prosecuting the right clients – even though a prosecution might not be the 'right' treatment to engender long term compliant behaviour. (Indeed a side effect can be that an enforcement culture permeates the organisation rather than a client service ethic that realises clients get it wrong for a wide variety of reasons.)

Research on regulatory compliance models indicates that a range of treatments should be available to engender long term voluntary compliance. An escalatory model is suggested to create an incentive for the client to move towards a more engaged, compliant behaviour set. This should take into account the facts and circumstances of the client's situation so as to treat the client in the most appropriate way:

o For example, recidivist clients (those who repeatedly offend after treatment) would generally warrant a different treatment than a client detected making an error for the first time.

o Similarly, those who promote non-compliance by others generally warrant different treatment to those who don't.

o Those in special positions of trust and influence in the tax system (eg key intermediaries, revenue authority staff, lawyers and accountants) generally warrant different treatment to those who aren't in such positions.

o Those involved in avoidance schemes aren't all the same. Clients with relatively low knowledge of the tax system who enter schemes on the advice of their trusted advisor should not be treated the same as those who would be reasonably expected to have good knowledge (such as the advisor).

The causal factors involved in the non-compliance also factor into the appropriateness of the choice of remedy. The causal factors for non-compliance could be a result of:

o a difference of views – a reasonably arguable position that differs from the ATO view,

o not being in a position to comply,

o honest mistake,

o ignorance,

o carelessness,

o negligence, or

o deliberate intent.

It is important that our 'choice of remedy' be appropriate and defendable – and that the mechanism to get to the decision on the remedy be evidence based and repeatable. These aspects can be brought together into an overall 'model' or framework to view non compliant behaviours based on the client's level of engagement with the regulatory system. As the NZ IRD version of the compliance model indicates (see the diagram below):

o Some treatments will apply to many clients at once and be rather general in nature – such as education materials available to the public.

o Other treatments will be client group specific – advice aimed at a particular industry, occupation or segment.

o Finally, some treatments are targeted at particular clients.


Compliance Models: New Zealand Tax Compliance Model

[Figure: The New Zealand IRD tax compliance model. See http://www.irs.gov/pub/irs-soi/04moori.pdf – paper by Tony Morris & Michele Lonsdale, NZ IRD, "Translating the compliance model into practical reality".]

The New Zealand IRD model is a useful adaptation of the revenue authority compliance model presented in the 1998 ATO Cash Economy Task Force Report:

[Figure: The ATO compliance model. Source: Improving Tax Compliance in the Cash Economy, Second Report, ATO Cash Economy Task Force, 1998, page 58.]


These compliance models, or frameworks, posit an escalating set of remedies to observed client behaviours:

o For those trying and succeeding to do the right thing – the majority of clients – compliance is made as simple as possible. Information requirements are reduced and interactions are made as cheap and easy as is practical. (See for example the ATO Easier, Cheaper, More Personalised Change Program @ http://www.ato.gov.au/content/downloads/Making_it_easier_to_comply_2005_06.pdf )

o For those trying, but not succeeding, in doing the right thing, education and advice is provided. This can be general, or aimed at a specific client segment – an industry or occupation group or some other discernable client grouping. (See http://ato.gov.au/corporate/content.asp?doc=/content/42628.htm on marketing and taxation. Here descriptive analytics to identify market segments can assist.)

o Some clients may request assistance and advice, and others may be targeted for a review whose outcome is advice, eg a record keeping review. Here predictive analytics can assist.

These interventions are generally relatively low cost mechanisms for enhancing voluntary compliance for clients who are trying to do the right thing.

o A smaller number of clients will usually exist who, for a variety of reasons, appear to have carelessly, negligently or deliberately not complied. For these clients a common treatment is to audit the client to determine the amount of the non-compliance and the reasons and, if considered appropriate, penalise the client for not complying. The audit may be targeted at a specific issue or may be a more wide ranging examination of the whole of the client's tax affairs.

o For those few clients that have relatively serious/aggressive non-compliance and other aggravating factors, the treatment may be to investigate with a view to prosecution. Due to the legal evidence gathering nature of these cases they tend to be relatively resource intensive and costly.

In order to objectively optimise compliance treatments for clients we need to capture data that reflects relevant client circumstances, the nature of the treatments used and the clients' response to the intervention over time. Rather than sequentially changing our treatments over time, we can evolve our optimum (champion) treatment via the use of controlled champion / challenger treatment groups, where analytically similar clients are assigned to different (though still appropriate!) treatments in order to evaluate which treatment works best at engendering long term voluntary compliance for the relevant client segment. By creating control groups of clients assigned to different treatment pools, champion / challenger strategies can be evaluated and used to determine the best treatment for a client given their facts and circumstances as revealed in the data. To work best this needs to be a deliberate strategy with controlled data capture rather than an after-the-event thought when much of the required data is no longer available for analysis. Standard parametric (or non parametric if the control groups are small) statistical tests can be used to identify the optimum treatment from the options tried.
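A sketch of the kind of standard test mentioned above, comparing a champion and a challenger treatment on a downstream compliance measure, is shown below. The data are simulated; a non-parametric Mann-Whitney test is shown alongside the parametric t-test as an option for small groups.

```python
# Sketch: comparing champion and challenger treatment outcomes with standard tests.
# Outcome here is a simulated downstream compliance measure per client (illustrative).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
champion   = rng.normal(loc=0.70, scale=0.15, size=400)   # e.g. post-treatment compliance score
challenger = rng.normal(loc=0.73, scale=0.15, size=400)

t_stat, t_p = stats.ttest_ind(champion, challenger, equal_var=False)   # parametric (Welch) t-test
u_stat, u_p = stats.mannwhitneyu(champion, challenger)                 # non-parametric alternative

print(f"t-test p={t_p:.3f}, Mann-Whitney p={u_p:.3f}")
```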

[Figure: Champion / challenger analysis of potential actions – return on investment over time. The champion treatment's current ROI trajectory from today is compared against challenger treatments 1 and 2 and a break even line.]


Optimisation – Assignment of clients to those treatments (Enhance candidate selection process)

Analytic approaches have a key role to play in the appropriate segmentation and allocation of clients into different treatment pools. Ideally, supervised learning approaches (predictive analytics) can be used to identify and assign clients into the most appropriate segment given their particular facts and circumstances, as revealed in the data holdings by an analysis of discriminate features in the data. Descriptive analytic techniques (eg clustering) can help you to understand the circumstances of groups of clients that have difficulty in complying, while supervised learning approaches 'mine' data on past interactions to discover knowledge (associations – patterns and trends) that allows us to better predict the response of a client to an interaction.

Data miners use statistical and machine learning algorithms to squeeze the maximum informational value out of the dataset in question. Often the datasets have to be transformed and cleansed to get the most out of them. Some statistical techniques, for example, work best with normally distributed, homoscedastic data. Much of the 'art' of data mining stems from the knowledge and experience of the data miner in recognising what technique is most appropriate to improve the signal to noise ratio (ie the identification of patterns in the data) in a given circumstance.

With supervised learning approaches the algorithms try to identify discriminate features in the data that relate to a target variable. (A discriminate feature is one that is positively or negatively correlated to the target variable in a statistically significant manner.) For example, clients that are considered at high risk of not responding to a letter or telephone call, because of past recidivism in relation to a matter as revealed by data on past interactions, may be assigned directly to a team for a review rather than to the lower cost work stream of an automated letter.

Some data mining algorithms are conceptually quite simple, such as rule induction and decision trees; others are more difficult to explain, such as neural networks (essentially a black box learned weighting approach), support vector machines (a machine learning partitioning approach) and random forests (building a large number of decision trees from random samples of the data that then vote on the classification outcome).
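As an illustration of the supervised learning approach just described, the sketch below fits a decision tree to an audit-style dataset with the same kinds of fields as the Rattle example later in the paper. The data here are synthetic and the code is Python/scikit-learn; the paper's own models were built in Rattle/R.

```python
# Sketch: a decision tree classifier on synthetic audit-style data
# (fields mirror the Rattle example: Age, Hours, Marital, Adjusted).
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in for audit-csv.txt: 'Adjusted' = 1 if the audit produced an adjustment.
data = pd.DataFrame({
    "Age":      [25, 52, 38, 61, 29, 47, 33, 58, 41, 36, 50, 27],
    "Hours":    [20, 45, 38, 50, 15, 40, 38, 60, 42, 25, 48, 30],
    "Marital":  ["Unmarried", "Married", "Married", "Married", "Unmarried", "Divorced",
                 "Married", "Married", "Divorced", "Unmarried", "Married", "Unmarried"],
    "Adjusted": [0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0],
})
X = pd.get_dummies(data[["Age", "Hours", "Marital"]])   # one-hot encode the categorical field
y = data["Adjusted"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

print(export_text(tree, feature_names=list(X.columns)))   # the learned rules
print("holdout accuracy:", tree.score(X_test, y_test))
```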

So we can personalise our treatment strategies to the client.

[Figure: A decision tree of rules derived from the data assigns scores that route clients to treatments such as Letter X, Letter Y, a call, a review or an audit. An accompanying chart compares the scores (0–1000) produced by different model types: Decision Tree, Neural Net, Rule Induction, Regression and DM Neural.]

In fact scores are likely to be done via several models 'voting' together – ensembles.
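A sketch of that 'several models voting together' idea follows, using a soft-voting ensemble on synthetic data. The choice of component models is mine for illustration, not the paper's.

```python
# Sketch: an ensemble of models 'voting' together to produce a single risk score.
# Synthetic data via scikit-learn; in practice the inputs would be client attributes.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=8, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier(max_depth=4, random_state=0)),
        ("forest", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("logit", LogisticRegression(max_iter=1000)),
    ],
    voting="soft",           # average the predicted probabilities rather than hard votes
).fit(X, y)

risk_scores = ensemble.predict_proba(X)[:, 1]   # averaged probability used as a risk score
print(risk_scores[:5])
```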

Descriptive and predictive analytics are a relatively recent field of business intelligence, enabled by the convergence of computing power, machine learning approaches and statistical understanding. Within the ATO we have a core group of ~16 data miners (most with PhDs in quantitative sciences), who use a variety of software tools such as SAS Enterprise Miner and SAS JMP, NCR Teradata Warehouse Miner and, more recently, open source tools like the excellent 'Rattle' to assist them in their work. I'll provide some examples from the use of Rattle, written by one of our senior data miners to harness a variety of open source statistical packages available in R, to give a feel for some of the advanced analytic techniques now available in easy to use software. Rattle is available from http://rattle.togaware.com/


Rattle – open source data mining software:

[Screenshot: the Rattle data mining interface.]

Exploring data relationships with Rattle – to identify possible features for selection modelling:

[Figure: Rattle exploratory plots for the audit dataset, grouped by the target variable Adjusted (All / 0 / 1) – cumulative distributions of Age, Income, Hours and Deductions, and frequency distributions of Marital status, Employment, Education and Occupation.]

Some more examples of the analysis methodologies supported in Rattle are contained in an Annex to this paper.


Some graphic examples from the analytic modelling and evaluation techniques available in Rattle:

Descriptive analytics – segmentation by decision tree against the target variable (adjustment).

[Figure: Rattle decision tree built on audit-csv.txt with target $Adjusted, splitting on Marital status, Education, Occupation, Hours worked (around 38) and Age (around 33.5); terminal node sizes range from 21 to 755 cases, with class purities between roughly 64% and 94%.]

K-means clustering – Find k clusters in the data (3 in this instance)
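A sketch of the same idea in code is shown below, finding three clusters in synthetic client data with scikit-learn's KMeans (the paper's own clustering was produced in Rattle; standardising the features first is my addition, as k-means is sensitive to scale).

```python
# Sketch: k-means clustering to find 3 client segments in synthetic data.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

X, _ = make_blobs(n_samples=600, centers=3, n_features=4, random_state=42)
X_std = StandardScaler().fit_transform(X)        # put features on a comparable scale

kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X_std)
print(kmeans.labels_[:10])          # cluster assignment for the first ten cases
print(kmeans.cluster_centers_)      # segment centroids in standardised feature space
```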


[Figure: hierarchical clustering output from Rattle, audit-csv.txt – apparently a dendrogram of individual clients (case numbers as leaf labels) with a cluster height axis running from 0 to 250,000.]

Dendrogram – visualising the 'closeness' of variables

[Figure: Variable Correlation Clusters, audit-csv.txt – a dendrogram of the variables Adjusted, Age, Hours, Deductions and Income, height axis 0.0 to 1.5. Rattle 2006-10-04 12:41:05.]

Hierarchical clustering – a top-down or bottom-up measure of client 'closeness'.

[Figure: Discriminant Coordinates, audit-csv.txt. Rattle 2006-10-04 12:38:05.]


Predictive analytics. The risk ranking of clients by predictive analytics allows a caseload to be prioritised so that optimal revenue and resourcing decisions might be made. In this example we can see that at 40% of the caseload the algorithm is returning 80% of the revenue.

[Risk chart: rf model, audit-csv.txt (test set), Adjustment target – Performance (%) of Revenue, Adjustments and Strike Rate plotted against Caseload (%); the overall strike rate is about 28%. Rattle 2006-10-02 16:27:39.]

Risk lift charts are a way of 'seeing' the improvement that an analytic model can produce, and they make the trade-off between caseload and revenue more obvious to management.
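A rough sketch of how such a cumulative gains curve can be computed by ranking cases on their model scores (the scores and revenue below are randomly generated for illustration, not the paper's audit data):

```python
import numpy as np

# Hypothetical model risk scores and the revenue adjustment eventually raised
# on each case (zero for non-productive cases).
rng = np.random.default_rng(0)
scores = rng.random(1000)
revenue = np.where(rng.random(1000) < scores, scores * 10_000, 0.0)

# Rank cases from highest to lowest risk score, then accumulate revenue.
order = np.argsort(-scores)
cum_revenue = np.cumsum(revenue[order]) / revenue.sum()
caseload = np.arange(1, len(scores) + 1) / len(scores)

# Read off the lift at 40% of the caseload, analogous to the risk chart above.
idx = int(0.4 * len(scores)) - 1
print(f"At {caseload[idx]:.0%} of the ranked caseload the model captures "
      f"{cum_revenue[idx]:.0%} of the revenue")
```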


The area under the ROC (receiver operating characteristic – the true positive rate plotted against the false positive rate as the discrimination threshold changes) curve gives us a view of the relative performance of different selection models.
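A minimal sketch of such an AUC comparison across a few of the model families named below, using scikit-learn on a synthetic data set (the models, parameters and data are illustrative assumptions, not the audit-csv example):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for an audit data set: features plus an 'adjusted' flag.
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "random forest": RandomForestClassifier(random_state=0),
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(max_depth=4, random_state=0),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{name:20s} AUC = {auc:.3f}")  # larger area under the ROC = better ranking
```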

In the audit-csv example the Support Vector Machine model (ksvm) slightly outperforms the Random Forest model (rf) and the Logistic Regression model (glm), and these outperform the single Decision Tree (rpart), which in turn significantly outperforms the Gradient Boost Model (gbm) in this instance. Where different models perform significantly better at different parts of the distribution, ensemble approaches can be used to harness the right model for the right part of the distribution.

Case selection models need to be evaluated from a number of perspectives. Better selection models select more true positives and true negatives and minimise false positives and false negatives. Note that the strike rate metric [true positives/(true positives + false positives)] is like the tip of an iceberg: it is what you can see and measure, but it is only part of the picture, as it provides no detail (except by inference) about those not selected (false negatives and true negatives).

Consider the following Taylor-Russell diagrams. Relatively poor selection model: [Taylor-Russell diagram]

Better selection model: [Taylor-Russell diagram]

Optimisation - Case mix
Once we have our treatments and case selection optimised, a final factor for consideration is how the overall case mix affects revenue collections and voluntary compliance. It would generally be serendipitous if an organisation's existing case mix happened to optimise its revenue yield. Analytic modelling techniques such as linear programming can provide insights to management regarding the optimum mix of case types and their revenue and resource impacts, and into the key parameters and their sensitivity to change.

A simple two case type example will be used to demonstrate how linear programming can be utilised in the optimisation of case mix by a revenue authority. Please read the following in conjunction with the graphs on the following pages.

Assume that there are two case types that staff can work on – an Income Tax Audit and a GST Audit – and that an Income Tax Audit on average returns $500 in revenue while a GST Audit returns $200. Assume that we have three levels of staff involved in the casework – EL2s, APS6s and APS4s – with five EL2s, four APS6s and four APS4s, all of whom can work 40 hours per week. An Income Tax Audit takes on average two hours of an EL2's time plus two hours of an APS6's and one hour of an APS4's. A GST Audit takes on average one hour of an EL2's time plus one hour of an APS4's. In this example an EL2 costs $50 per hour, an APS6 costs $20 per hour and an APS4 costs $15 per hour. The linear optimisation question is: what case mix optimises net revenue?

We can see that 2 Income Tax Audits = 5 GST Audits in terms of revenue. This is the slope of our revenue line. We can further see that we have five constraints:
o The number of Income Tax Audits must be greater than or equal to zero (we cannot have negative case numbers).
o The number of GST Audits must also be greater than or equal to zero.
o The EL2 effort on cases must be less than the available EL2 effort hours (5 x 40 = 200).
o The APS6 effort on cases must be less than the available APS6 effort hours (4 x 40 = 160).
o The APS4 effort on cases must be less than the available APS4 effort hours (4 x 40 = 160).

We can see that:
o The EL2 constraint translates into a maximum of 100 Income Tax Audits (@ 2 hrs each) or a maximum of 200 GST Audits (@ 1 hr each);
o The APS6 constraint translates into a maximum of 80 Income Tax Audits (@ 2 hrs each); and
o The APS4 constraint translates into a maximum of 160 Income Tax Audits (@ 1 hr each) or 160 GST Audits (@ 1 hr each).

Graphically this produces a feasibility space within which the actual case mix must sit, and an optimum case mix exists at the boundary of that space. When we place our revenue line onto the diagram and move it to its outermost point on the feasibility space we identify our optimum case mix: 80 Income Tax Audits and 40 GST Audits. The 'binding' constraints are the effort hours available from our EL2s and APS6s; at the optimum point we have spare APS4 capacity.

If we introduce a coverage constraint (we must do at least 60 GST Audits) it reduces the feasibility space and creates a new optimum point. The binding constraints at that point are the EL2 effort time and the GST coverage constraint, and we have spare APS6 and APS4 time.

This was of course a very simple example; however, the methodology holds for greater numbers of case types and resource types – it just cannot be shown graphically. (An example spreadsheet for more case types is included in the annex; it uses the Excel 'solver' add-in.) This type of analysis can be further enhanced by modelling distributions rather than fixed amounts (eg the distribution of revenue and of effort time by case type). Software such as @Risk can be used for this purpose (see http://www.palisade.com).
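The same two-case-type problem can be posed to any linear programming solver. A minimal sketch using Python's scipy library (an assumption on tooling; the paper itself uses the Excel 'solver' add-in and @Risk):

```python
from scipy.optimize import linprog

# Variables: x[0] = Income Tax Audits, x[1] = GST Audits.
# linprog minimises, so we negate the revenue per case to maximise it.
revenue = [-500, -200]

# Staff-hour constraints (A_ub @ x <= b_ub):
#   EL2:  2 hrs per IT audit + 1 hr per GST audit <= 5 staff x 40 hrs = 200
#   APS6: 2 hrs per IT audit                      <= 4 staff x 40 hrs = 160
#   APS4: 1 hr  per IT audit + 1 hr per GST audit <= 4 staff x 40 hrs = 160
A_ub = [[2, 1],
        [2, 0],
        [1, 1]]
b_ub = [200, 160, 160]

result = linprog(c=revenue, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(result.x)      # expected optimum: [80, 40]
print(-result.fun)   # revenue at the optimum: 48000
```

Adding the coverage constraint (at least 60 GST Audits) can be expressed as an extra row [0, -1] in A_ub with a bound of -60, which shifts the optimum to 70 Income Tax Audits and 60 GST Audits.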

[Pages 26–28: feasibility space and revenue line graphs for the linear programming example, with and without the GST coverage constraint (graphs not reproduced).]

Other matters
Prioritising analytic work
There is a wide range of uses to which a competent analytic capability can be put within a revenue authority – it is an 'in demand' skill and experience set. It is relatively easy for such staff to be deployed on research-type projects to discover some interesting insight about our client base; however, this may not be the best way of prioritising tasks for the analytic capability. When looking at where to apply analytic techniques to optimise revenue, a useful approach may be to consider either the revenue collected or the staff utilised against the relative strike rate being achieved. (A priori, an area with a low strike rate has the largest potential to improve. Multiplying the revenue being collected, or the staff utilised, by 1-Sr produces a rectangle whose relative size indicates the potential room for improvement.) In the hypothetical example below, large business already has a relatively high strike rate and fewer staff, so there is less room for efficiency gains; the cash economy has a relatively higher number of active compliance staff and a lower strike rate than large business, hence analytic enhancements to the strike rate would provide a greater efficiency gain there. From a revenue perspective it is less clear cut.

[Charts: 'Efficiency – Staff Focus' and 'Effectiveness – Revenue Focus' – FTE x (1-Sr) and $ x (1-Sr) rectangles for the Cash Economy and Large Business segments.]
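A tiny sketch of the 1-Sr 'room for improvement' rectangles (the segment names, FTE counts, revenue and strike rates below are entirely hypothetical):

```python
# Hypothetical segments: active compliance FTE, revenue ($m) and strike rate.
segments = {
    "Cash economy":   {"fte": 800, "revenue_m": 150, "strike_rate": 0.40},
    "Large business": {"fte": 300, "revenue_m": 900, "strike_rate": 0.75},
}

for name, s in segments.items():
    slack = 1 - s["strike_rate"]   # share of current cases that are non-productive
    print(f"{name:15s} staff-focus headroom {s['fte'] * slack:6.0f} FTE x (1-Sr), "
          f"revenue-focus headroom ${s['revenue_m'] * slack:5.0f}m x (1-Sr)")
```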


Measuring strikes
Another factor in the optimisation equation is getting a consistent view across the organisation of what counts as a 'strike' or successful selection. Where iterative refinement processes are used, it becomes questionable what a true false positive is when candidate cases are filtered out without a full review. A richer picture emerges when one looks at the selection process end to end, taking into account the various review and refinement approaches (and resources) used to ultimately allocate a case. A wider variety of metrics could be produced, such as those below (a small worked sketch follows the list):
o The selection coverage rate (% of clients in the population run through the selection algorithm). Output is a candidate case.
o The review coverage rate (candidate cases reviewed by an experienced officer prior to allocation).
o The data quality error rate (candidate cases not proceeded with because on review the data is incorrect).
o The selection error rate (candidate cases not allocated because on review they were not a strike).
o The audit/prosecution coverage rate (cases allocated to compliance staff).
o The strike rate (successful cases from those allocated).
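A small sketch of how such end-to-end metrics might be computed (the counts and the choice of denominators are illustrative assumptions, not a prescribed definition):

```python
# A hypothetical end-to-end selection funnel.
population = 100_000     # clients in the population of interest
scored = 80_000          # clients actually run through the selection algorithm
candidates = 5_000       # candidate cases output by the algorithm
reviewed = 4_000         # candidates reviewed by an experienced officer
bad_data = 400           # reviewed candidates dropped because the data was wrong
not_a_strike = 600       # reviewed candidates judged not to be a strike
allocated = 3_000        # cases allocated to compliance staff
successful = 1_800       # allocated cases that produced an adjustment

print(f"Selection coverage rate : {scored / population:.1%}")
print(f"Review coverage rate    : {reviewed / candidates:.1%}")
print(f"Data quality error rate : {bad_data / reviewed:.1%}")
print(f"Selection error rate    : {not_a_strike / reviewed:.1%}")
print(f"Audit coverage rate     : {allocated / candidates:.1%}")
print(f"Strike rate             : {successful / allocated:.1%}")
```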

We should note that the outcomes of the process are determined not only by the effectiveness of the case selection process but also by the effectiveness of the treatment applied and the capability of the staff providing the treatment. The case selection process can influence the effectiveness of the treatment by:
o selecting appropriate treatments based on attributes of the taxpayer (and historic effectiveness);
o narrowing the risks that need to be investigated (or highlighted, in the case of passive treatments such as letters); and
o supplying contextual information to direct the case worker to the likely source of non-compliance.

The effect of varying strike rates
As analytic methods improve the strike rate for a particular case type, a revenue authority faces the question of whether to retain the staff on that case type or to move some of them to other case types. The change in resourcing requirements as strike rates change depends on the relative trade-off between the resources required for a productive case and those required for a non-productive case. As non-productive cases are typically less resource intensive, it is not a simple matter of saying that 10 fewer non-productive cases means 10 more productive cases can be completed; revenue projections made on the basis of a one-for-one trade-off will be wrong in the majority of circumstances. Revenue gains and/or staff savings can be derived from strike rate improvements. The impact of varying strike rates upon the revenue authority largely depends on whether it is guided by revenue targets or staffing constraints.

Fixed staffing scenario
If staffing is fixed then increasing strike rates can generate additional revenue for the authority – an effectiveness gain (the same number of staff can complete more productive cases). This is not a straight trade-off of a non-productive case for a productive case, as the effort time required for a productive case will often be significantly more than the effort required to complete a non-productive case.


It follows that an improvement in the strike rate from 10% to 15% (a 50% improvement) does not have the same revenue impact as a change from 90% to 95% (a 5.6% improvement). At higher strike rates the same percentage-point change in the strike rate results in fewer additional productive cases than it does at lower strike rates.

[Chart: Productive to Non-Productive Caseload by Strike Rate with a 3:1 Effort Time Differential – Fixed Staffing, 30% base strike rate. Number of cases completed (0–2,000) plotted against strike rate (15% to 95%), split into productive and non-productive cases.]

The chart above plots the effect of a 3:1 effort time differential on case numbers as strike rates change with fixed staffing. With fixed staffing, as the strike rate falls the total number of cases able to be completed grows non-linearly, in accordance with the effort time differential between a productive and a non-productive case: the total caseload is TCmax/(1+[tPC/tNPC-1]*Sr), of which a fraction Sr is productive.

If all cases are productive, the number of cases able to be completed = available staff effort hours (nS)/productive case effort time (tPC). If all cases are non-productive, the maximum number of cases able to be completed (TCmax) = available staff effort hours (nS)/non-productive case effort time (tNPC).

For a particular strike rate (Sr) the number of productive cases (nPC) is given by:
nPC = TCmax*Sr/(1+[tPC/tNPC-1]*Sr)
and the number of non-productive cases (nNPC) is given by:
nNPC = TCmax-(tPC/tNPC)*nPC

The revenue gain from a change in strike rate is given by:
∆R$ = nPC2*$rPC - nPC1*$rPC, where nPCx = TCmax*Srx/(1+[tPC/tNPC-1]*Srx)
(ie the change in the number of productive cases times the revenue per productive case – assuming that revenue per productive case does not change significantly as case numbers or the strike rate change).
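A small sketch of the fixed staffing relationships above (the total effort hours, effort times and revenue per productive case are illustrative assumptions with a 3:1 differential):

```python
def fixed_staffing_caseload(sr, total_hours=10_000, t_pc=30.0, t_npc=10.0):
    """Productive and non-productive case numbers for a given strike rate
    when total staff effort is fixed (3:1 effort time differential here)."""
    tc_max = total_hours / t_npc                        # all cases non-productive
    n_pc = tc_max * sr / (1 + (t_pc / t_npc - 1) * sr)  # productive cases
    n_npc = tc_max - (t_pc / t_npc) * n_pc              # remaining effort as non-productive cases
    return n_pc, n_npc

# Revenue gain from lifting the strike rate from 30% to 40%, at $10,000 per productive case.
revenue_per_case = 10_000
gain = (fixed_staffing_caseload(0.40)[0] - fixed_staffing_caseload(0.30)[0]) * revenue_per_case
print(f"Additional revenue: ${gain:,.0f}")
```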


Fixed revenue target scenario
If the revenue target is considered fixed then increasing strike rates can generate staff efficiency savings for the revenue authority (fewer staff are needed for the same number of productive cases).

[Chart: Productive to Non-Productive Caseload by Strike Rate with a 3:1 Effort Time Differential – Fixed Revenue, 30% base strike rate. Number of cases completed (0–6,000) plotted against strike rate (95% down to 15%), split into productive and non-productive cases.]

The chart above plots the effect of a 3:1 effort time differential on case numbers as the strike rate changes with a fixed revenue target and variable staffing. With a fixed revenue target, as the strike rate falls the number of cases required to meet the target grows non-linearly, because more cases must be worked to maintain the revenue amount. The number of productive cases required for the revenue is fixed irrespective of the strike rate; it is the total number of cases needed to be completed, and hence the staffing, that varies to maintain the revenue target.

In this situation the number of productive cases required equals the revenue target (R$) divided by the average productive case result ($rPC). (eg A $3 million revenue target at an average of $10,000 per case requires the completion of 300 productive cases.) The number of productive cases is fixed at this level irrespective of the strike rate, so:
nPC = R$/$rPC
For a particular strike rate (Sr) the number of non-productive cases (nNPC) is given by:
nNPC = (nPC-Sr*nPC)/Sr
and the staff effort required is given by:
nS = nPC*tPC + nNPC*tNPC
(ie the time spent on productive cases plus the time spent on non-productive cases).

The staff saving from a change in strike rate is given by:
∆nS = nNPC2*tNPC - nNPC1*tNPC, where nNPCx = (nPC-Srx*nPC)/Srx
(ie the change in the number of non-productive cases times the effort used on those cases – assuming that effort time per non-productive case does not change significantly as case numbers or strike rates change).
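A matching sketch for the fixed revenue scenario (again with illustrative parameter values):

```python
def fixed_revenue_staffing(sr, revenue_target=3_000_000, rev_per_case=10_000,
                           t_pc=30.0, t_npc=10.0):
    """Cases and effort hours needed to hit a fixed revenue target at a given
    strike rate (hypothetical parameters, 3:1 effort time differential)."""
    n_pc = revenue_target / rev_per_case   # productive cases are fixed by the target
    n_npc = n_pc * (1 - sr) / sr           # non-productive cases worked to find them
    effort_hours = n_pc * t_pc + n_npc * t_npc
    return n_pc, n_npc, effort_hours

# Effort saved by lifting the strike rate from 30% to 40%.
saving = fixed_revenue_staffing(0.30)[2] - fixed_revenue_staffing(0.40)[2]
print(f"Effort hours saved: {saving:,.0f}")
```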


Traditional versus analytic driven case selection – an example

Traditional case selection. Subject matter experts use their experience to create case selection rules that filter non-compliant clients out from the broader population in respect of particular risks – OLAP-type, 'slice and dice' approaches. They focus on client features that they believe assist in revealing whether a client is compliant or not, and these rules are often refined over time to enhance the strike rate. The rules produced are generally subjectively weighted to derive client risk scores for work prioritisation. Eg (hypothetical example):
o If the client has a WRE claim > $1,500 and a Uniform claim > $500 then risk score = 4
o If the client has a motor vehicle claim > $5,000 then risk score = 2
o If the client has a self education claim > $2,000 then risk score = 3
o Add the client risk scores to produce a total risk score
o Select for review clients with a total risk score > 8
Rules produced this way (top down) have the advantage of being 'known' to a subject matter expert and are more explainable – but they are generally not optimal.

Analytic case selection. Past cases are divided into 'successful' cases (where relevant non-compliance was found) and 'unsuccessful' cases (where it wasn't). The data set is further divided into a 'training set' (used to build the analytic case selection model) and a 'validation set' (used to test that the model works). Working with subject matter experts, the analytics modeller identifies client features that appear to be associated with non-compliance and tests them statistically. The training data set is then essentially regressed against the target set of successful cases to produce a risk scoring algorithm that optimises the probability of predicting a successful case from the data. The algorithm is then tested against the validation data set to see how it performs, and the rules it produces are weighted by the algorithm to give a client risk score for work prioritisation. Rules produced this way (bottom up) sometimes surprise subject matter experts and may require effort to understand and explain (though generally a simple decision tree can be retro-fitted to provide explanatory power to the output of a more complex model that produces a higher strike rate). The rules produced from such data driven approaches will usually outperform rules derived from subjective subject matter views on discriminating features and their weighting.
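A toy illustration of the contrast, comparing a fixed, subjectively weighted rule score against a score learned from past outcomes (the claim types, thresholds, coefficients and data below are all invented for illustration; scikit-learn is assumed to be available):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 5000

# Hypothetical claim amounts loosely echoing the rule-based example above.
X = np.column_stack([
    rng.gamma(2.0, 800, n),    # work-related expense claim
    rng.gamma(1.5, 2000, n),   # motor vehicle claim
    rng.gamma(1.2, 900, n),    # self-education claim
])
# Synthetic 'adjusted on audit' flag: larger claims are more likely to be adjusted.
p = 1 / (1 + np.exp(-(X @ [0.0006, 0.0002, 0.0005] - 3)))
y = rng.random(n) < p

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# Traditional-style rule score: fixed thresholds, subjectively weighted.
rule_score = (4 * (X_test[:, 0] > 1500) + 2 * (X_test[:, 1] > 5000)
              + 3 * (X_test[:, 2] > 2000))

# Analytic-style score: weights learned from past outcomes.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
learned_score = model.predict_proba(X_test)[:, 1]

print("Rule-based AUC :", round(roc_auc_score(y_test, rule_score), 3))
print("Learned AUC    :", round(roc_auc_score(y_test, learned_score), 3))
```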


Optimising capability
The analytic methods outlined in this paper are, of course, only part of the picture and will not greatly assist in optimising compliance if the capabilities behind the treatment strategies are not also operating optimally. If compliance staff skills, knowledge and experience are not up to the task then the treatment will not be optimal. Staff cannot be overzealous, incompetent or corrupt in the performance of this work, and continuous improvement/quality assurance measures need to be in place to ensure that client treatment is optimally delivered. For example, the use of:
o Performance standard & exception monitoring
o Peer/team leader quality reviews
o Periodic performance reviews & client perception surveys
o Dual sign-off procedures (eg case officer – team leader)
o Escalation (& de-escalation) procedures
o 'Case call-over' procedures so that all significant/old cases are reviewed by someone outside the team
o Back-log procedures that are triggered when the median age of the case pool reaches a particular threshold above the standard.

Conclusions
The development of risk and intelligence approaches is reaching a new level of sophistication. The convergence of computing power, machine learning approaches and statistical algorithms is providing revenue authorities with the capability to be much more personalised in the targeting of treatments to clients. The successful integration of these techniques with robust management approaches to the use of strategic intelligence is enabling revenue authorities to be much more purposeful in how they treat the risks that challenge them. Harnessing this opportunity requires us to make the transition to a more quantitative, evidence-based approach to management. The challenge is to make it happen. After all, "the best way to predict the future is to invent it." [9]

Stuart Hamilton
November 2006

[9] Alan Kay, 2003 Turing Award winner and inventor of many of the computer GUI interface aspects we take for granted today.

Annexes
Scoring using ∆Tax
Let us look at how a client's risk score might be computed using the ∆Tax concept. Clients A, B and C have paid the following tax over the last three years:

Client A: Y3: 3,000   Y2: 3,000   Y1: 3,000
Client B: Y3: 3,000   Y2: 3,000   Y1: 3,000
Client C: Y3: 3,000   Y2: 3,000   Y1: 0

A predictive algorithm estimates that the following tax may have been avoided or evaded (ie is at risk):

Client A: Y3: 2,700   Y2: 2,700   Y1: 2,700
Client B: Y3: 2,700   Y2: 0       Y1: 0
Client C: Y3: 0       Y2: 2,700   Y1: 2,700

Values & scores – multi year (Y3 -> Y1)
Revenue average = (8,100+2,700+5,400)/3 = 5,400; Severity average = (0.474+0.231+0.474)/3 = 0.393

            Client A        Client B        Client C
Revenue     8,100 [1.50]    2,700 [0.50]    5,400 [1.00]
Severity    0.474 [1.21]    0.231 [0.59]    0.474 [1.21]

where, for Client A, 1.50 = 8,100/5,400, 0.474 = 8,100/(8,100+9,000) and 1.21 = 0.474/0.393.

A single weighted 90/10 Revenue/Severity multi-year risk score then gives:
Client A: 1.47 [0.9*1.50+0.1*1.21]   Client B: 0.51   Client C: 1.02

These can be transformed further into a score from 1.000 to 0.000 by setting the highest score equal to 1.000 and scaling the other scores off it:
Client A: 1.000 [1.47/1.47]   Client B: 0.347 [0.51/1.47]   Client C: 0.694 [1.02/1.47]

High score → high risk, so we select Client A, then Client C, then Client B. Note that this example did not use present value, weighting or discounting, for clarity.
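The same worked example, reproduced as a short script (the figures match those above; any small differences are rounding only):

```python
# The worked Delta-Tax example above, in code.
clients = {
    "A": {"tax_paid": [3000, 3000, 3000], "tax_at_risk": [2700, 2700, 2700]},
    "B": {"tax_paid": [3000, 3000, 3000], "tax_at_risk": [2700, 0, 0]},
    "C": {"tax_paid": [3000, 3000, 0],    "tax_at_risk": [0, 2700, 2700]},
}

# Revenue = total tax at risk; severity = at-risk share of (at risk + paid).
revenue = {c: sum(v["tax_at_risk"]) for c, v in clients.items()}
severity = {c: sum(v["tax_at_risk"]) / (sum(v["tax_at_risk"]) + sum(v["tax_paid"]))
            for c, v in clients.items()}

rev_avg = sum(revenue.values()) / len(revenue)
sev_avg = sum(severity.values()) / len(severity)

# 90/10 weighted score of the revenue and severity indices, then scaled to the top client.
raw = {c: 0.9 * (revenue[c] / rev_avg) + 0.1 * (severity[c] / sev_avg) for c in clients}
top = max(raw.values())
for c in sorted(raw, key=raw.get, reverse=True):
    print(f"Client {c}: weighted score {raw[c]:.2f}, scaled score {raw[c] / top:.3f}")
```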


Some additional Rattle analysis output
It is important that our understanding of our clients moves beyond simplistic data analysis approaches based on summary descriptive statistics such as means, medians, modes and standard deviations. For example, in a positively skewed distribution (as most distributions in tax are):
o A mean estimate of a client's income will overstate income more times than it understates it.
o A median estimate will overstate and understate it with equal frequency.
o A mode estimate will understate income more times than it overstates it.
In all of these cases a single point estimate based on the mean, median or mode will be significantly less accurate than an estimate produced by modelling the income using discriminating variables. It is the difference between trying to sum up a song with a single note versus having an MP3 version of it (compressed to the sounds that matter and so still understandable). We need to reach back into the distributions to understand the variance between client groups and find the variables that matter to the problem we are looking at. With the computing power and algorithms now available we should be striving for the highest practical degree of accuracy in our estimates; anything less borders on professional negligence for an analyst.
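A quick numerical illustration of those three points on a synthetic, positively skewed 'income' sample (the lognormal parameters are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical positively skewed income distribution.
income = rng.lognormal(mean=11, sigma=0.7, size=100_000)

mean, median = income.mean(), np.median(income)
# Crude mode estimate from a histogram of the sample.
counts, edges = np.histogram(income, bins=200)
mode = (edges[counts.argmax()] + edges[counts.argmax() + 1]) / 2

for name, estimate in [("mean", mean), ("median", median), ("mode", mode)]:
    overstated = (estimate > income).mean()
    print(f"{name:6s} point estimate {estimate:9.0f} overstates a client's income "
          f"{overstated:.0%} of the time")
```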

[Rattle output: distribution plots from audit-csv.txt – Deductions, Hours, Adjustment, Employment, Income, Age, Education, Marital and Occupation, each shown for all clients and split by the Adjusted flag (0/1). Rattle 2006-10-04 12:48.]

Spreadsheets used in this paper
Linear programming: Case Type OptimisationV5.xls (...
Impact of strike rate changes on case numbers: case_work_calcs_v2.xls (46 KB)...

@Risk from Palisade can be used with these spreadsheets to provide distributional input rather than fixed values. This enables simulations such as the linear programming example to be more realistic in their projections. See http://www.palisade.com/downloads/pdf/Palisade_RISK_0604PA.pdf

Open source data mining software:
o Rattle (R based): http://rattle.togaware.com/
o Weka (Java based): http://www.cs.waikato.ac.nz/ml/weka/
o Yale (Java based): http://rapid-i.com/content/blogcategory/10/21/lang,en/ (connects to Weka)
o Knime (Eclipse/Java based): http://www.knime.org/ (connects to Weka)


Summary of some analytic methodologies that might be used in optimisation

[Diagram: the compliance model – understand the customer, determine the right experience, deliver the right experience – overlaid with where analytic methodologies fit. The recoverable content is summarised below.]

Research:
> Intelligence (Qualitative): Strategic (what to look at), Operational (who to look at), Tactical (what is needed to complete the case)
> Analytics (Quantitative): Descriptive analytics

Delivery:
> Channel management + Case Management (multiple activities) + Work Management (single activities)

Attitude to compliance (creating pressure down): have decided not to comply; don't want to comply; try to, but don't always succeed; willing to do the right thing.
Compliance strategy: use the full force of the law; deter by detection; assist to comply; make it easy.
Treatments: investigate / prosecute; audit / penalise; review / advise; educate / market; system changes, simplification and prepopulation.

Optimise understanding of the client
> Descriptive analytics
>> Distribution – mean, median, mode, kurtosis, skewness
> Exploratory data analysis – understand the data
>> Cumulative distributions
>> Analysis of variance
>> Box whisker – interquartile range & outliers
>> Missing values – data cleaning
>> Transformations / normalisation
>> Principal components
>> Benford's analysis
>> Dendrograms
>> Time series analysis
> Clustering – how do clients naturally group in the data
>> k-means
>> Hierarchical
>> Self organising feature maps

Optimise treatments
> Champion / challenger approach
> Control groups

Optimise treatment selection
> Predictive analytics – risk score modelling
>> Decision tree
>> Random forest
>> Logistic regression
>> Support vector machine
>> Neural networks

Optimise delivery
> Simulation modelling – decision support
> Queuing methodologies
> Linear programming
> Sensitivity modelling of parameters

Another view of the end to end process from risk identification to case outcomes:


[Diagram: end-to-end process from risk identification to case outcomes. Elements shown: Taxpayers; Risk Identification; Risk Treatment Development; Model & Treatment Strategy; Candidate Population; Modelling; Ranked Candidates; Coverage & Revenue targets; Operationalise Analytics; Cases; Risk Prioritisation; Resource Allocation; Demand Management; Siebel Work & Case Management; Results. The modelling stages are annotated 'Optimise treatment & candidate selection' and the allocation stages 'Optimise risk priority & case mix selection'.]

