Is cheese some kind of substitute for bread, so that one does not need any bread if one has cheese? The target type, which can be selected via the option -t, is either frequent item sets (default, option -ts), closed item sets (option -tc), maximal item sets (option -tm), generators (also called free item sets, option -tg) or association rules (option -tr). But it is much steeper for a small prior confidence than for a large one and it is non-linear. In order to find association hyperedges, choose rule confidence as the additional evaluation measure (option -ec) and averaging as the aggregation mode (option -aa); see this section for more explanations. Likewise, the relative support of S is the fraction or percentage of the transactions in T which contain S.
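The definition of relative support can be illustrated with a minimal Python sketch. It is not taken from the program; the function names and the small transaction list T are made up for illustration only.

```python
def absolute_support(transactions, item_set):
    """Count the transactions that contain every item of item_set."""
    items = frozenset(item_set)
    return sum(1 for t in transactions if items <= set(t))


def relative_support(transactions, item_set):
    """Fraction of the transactions that contain item_set."""
    if not transactions:
        return 0.0
    return absolute_support(transactions, item_set) / len(transactions)


# Hypothetical transaction database T (illustration only).
T = [
    {"bread", "cheese"},
    {"bread", "wine"},
    {"cheese", "wine"},
    {"bread", "cheese", "wine"},
    {"bread"},
]

print(absolute_support(T, {"bread", "cheese"}))  # 2
print(relative_support(T, {"bread", "cheese"}))  # 0.4
```

Multiplying the relative support by 100 gives the percentage form of the same quantity.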
This nicely models the statistical significance of confidence changes. If we search for association rules, we do not want just any association rules, but "good" association rules. Note that this is a safe pruning rule, because no superset of an infrequent item set can be frequent.
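Why the pruning rule is safe can be seen in a small Python sketch: a candidate item set of size k is discarded as soon as one of its subsets of size k-1 is not frequent, since it then cannot be frequent itself. The function names and the example sets are assumptions made for illustration; this is not how the program organizes its search internally.

```python
from itertools import combinations


def has_infrequent_subset(candidate, frequent_smaller):
    """True if some subset with one item less is not known to be frequent."""
    k = len(candidate)
    return any(
        frozenset(sub) not in frequent_smaller
        for sub in combinations(candidate, k - 1)
    )


def prune_candidates(candidates, frequent_smaller):
    """Keep only candidates all of whose (k-1)-subsets are frequent."""
    return [c for c in candidates
            if not has_infrequent_subset(c, frequent_smaller)]


# Hypothetical frequent item sets of size 2 (illustration only).
frequent_pairs = {
    frozenset(p) for p in [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d")]
}
candidates = [frozenset("abc"), frozenset("abd")]

# {a, b, d} is pruned because its subset {a, d} is not frequent.
kept = prune_candidates(candidates, frequent_pairs)
print(kept == [frozenset("abc")])  # True
```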
Still another possibility consists in using a standard set of unique transactions and assigning to each of them an occurrence counter. This number becomes even larger if one allows for multiple items in the consequent. Of course, these latter two rules together do not say the same as the more complex rule; they do contain additional information.
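The idea of a set of unique transactions with occurrence counters can be sketched in Python as follows. The sketch only illustrates the data structure; the names compress and support and the sample data are hypothetical and do not describe the program's actual implementation.

```python
from collections import Counter


def compress(transactions):
    """Map each distinct transaction (as a frozenset) to its occurrence count."""
    return Counter(frozenset(t) for t in transactions)


def support(counted, item_set):
    """Absolute support computed from the unique transactions and counters."""
    items = frozenset(item_set)
    return sum(n for t, n in counted.items() if items <= t)


# Hypothetical raw transactions; item order inside a transaction is irrelevant.
raw = [["a", "b"], ["b", "a"], ["a", "b", "c"], ["c"], ["a", "b"]]
counted = compress(raw)

print(len(counted))                 # 3
print(support(counted, {"a", "b"}))  # 4
```

Each distinct transaction is inspected only once per support computation, and its counter is added instead of the value 1.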
With the option -o the original rule support definition can be selected. This is explained in more detail below. The aggregated value is the evaluation of the item set, and item sets can now be filtered by requiring a minimum value for this aggregation with the option -d.
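As a rough illustration of such an aggregation, the following Python sketch averages the confidences of all rules that can be formed from an item set with a single item in the consequent, and then filters by a minimum value. That reading of the aggregation, as well as all names and the sample data, are assumptions made for this sketch and are not taken from the program.

```python
def support(transactions, item_set):
    """Absolute support: number of transactions containing item_set."""
    items = frozenset(item_set)
    return sum(1 for t in transactions if items <= t)


def average_rule_confidence(transactions, item_set):
    """Average confidence of the rules (S minus {i}) -> i over all items i in S.

    Assumption of this sketch: one rule per item, each with a
    single-item consequent and the remaining items as antecedent.
    """
    items = frozenset(item_set)
    if not items:
        return 0.0
    supp_all = support(transactions, items)
    confidences = []
    for i in items:
        supp_antecedent = support(transactions, items - {i})
        confidences.append(supp_all / supp_antecedent if supp_antecedent else 0.0)
    return sum(confidences) / len(confidences)


# Hypothetical transactions (illustration only).
T = [frozenset(t) for t in ["abc", "ab", "ac", "bc", "abc"]]

value = average_rule_confidence(T, "abc")  # each rule has confidence 2/3
minimum = 0.6                              # hypothetical minimum value
print(value >= minimum)                    # True
```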
That is, it is counted for the total number of transactions, but does not count for the support of any item set other than the empty set. Potentially interesting rules differ significantly in their confidence from the confidence of rules with the same consequent, but a simpler antecedent. That is, in CSV-format, the above input file would look like this (file test1).
This can be useful, for example, if one wants to restrict the analysis to a subset of all items. These restrictions are covered by the target type. With the option -p the additional item set evaluation measure can additionally be used for pruning the search.
If the rule is applicable, it says that the customer can be expected to buy cheese. For my Apriori program this is the default operation mode since version 4. Of course, we do not want just any association rules, we want "good" rules, rules that are "expressive" and "reliable".
Information is measured as a reduction of entropy. Note that the argument to this option is interpreted as a percentage if it is positive, but if it is negative, it is interpreted as an absolute number (number of transactions) rather than a percentage. This can be fixed by declaring the comma a blank character (option -b).
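How an argument that is a percentage when positive and an absolute number when negative might be turned into a transaction count is sketched below in Python. The function name is hypothetical, and rounding the percentage up to the next integer is an assumption of this sketch, not a statement about the program.

```python
import math


def min_support_count(value, n_transactions):
    """Translate a support argument into an absolute transaction count.

    value > 0: percentage of n_transactions (rounded up; an assumption).
    value < 0: absolute number of transactions, given as its negation.
    """
    if value < 0:
        return math.ceil(-value)
    return math.ceil(value * n_transactions / 100.0)


print(min_support_count(10, 50))  # 5  (10% of 50 transactions)
print(min_support_count(-3, 50))  # 3  (absolute number of transactions)
```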