CSE3212 Data mining - Semester 2, 2007

Unit leader :

Maria Indrawan

Lecturer(s) :

Caulfield

  • Bala Srinivasan
  • Campbell Wilson

Tutor(s) :

Caulfield

  • Samar Zutshi
  • Manoj Kathpalia

Introduction

Welcome to CSE3212 Data Mining for semester 2, 2007. This 6-point unit is an elective in the Bachelor of Computing and Master of Applied Information Technology (MAIT). The unit has been designed to provide you with an understanding of data mining techniques and their roles in solving organisational issues.

Unit synopsis

This subject provides an overview of the techniques used to search for knowledge within a data set using both supervised and unsupervised learning. The techniques include Classification, Prediction, Clustering, Association Rules, Decision Trees, and Neural Networks. Students will be able to choose an appropriate technique to suit a particular situation.

Learning outcomes

This unit aims to develop students' knowledge of the techniques and methods for data exploration in large databases, both those currently in use and those presently being researched, and to familiarise students with the currently available techniques for extracting information from large databases. At the completion of study, students will:
  • have an understanding of the purpose of data mining;
  • have an understanding of the major techniques for data mining;
  • have developed the skill to choose an appropriate technique for a particular situation;
  • have developed the knowledge to allow them to apply a process to the acquisition of knowledge from a data store;
  • have the skills to use a number of implementations of data mining software.

Workload

Students will be expected to commit the following hours a week:

  • a two-hour lecture;
  • a two-hour tutorial;
  • an average of 4-8 hours of personal study for reading and assignment work.

Unit relationships

Prerequisites

FIT1004, CSE2132 or equivalent

Relationships

MAIT students who have completed CSE5230 Data Mining are prohibited from enrolling in this unit.

Continuous improvement

Monash is committed to 'Excellence in education' and strives for the highest possible quality in teaching and learning. To monitor how successful we are in providing quality teaching and learning, Monash regularly seeks feedback from students, employers and staff. Two of the formal ways that you are invited to provide feedback are through Unit Evaluations and through Monquest Teaching Evaluations.

One of the key formal ways students have to provide feedback is through Unit Evaluation Surveys. It is Monash policy for every unit offered to be evaluated each year. Students are strongly encouraged to complete the surveys as they are an important avenue for students to "have their say". The feedback is anonymous and provides the Faculty with evidence of aspects that students are satisfied with and of areas for improvement.

Student Evaluations

The Faculty of IT administers the Unit Evaluation surveys online through the my.monash portal, although for some smaller classes there may be alternative evaluations conducted in class.

If you wish to view how previous students rated this unit, please go to http://www.monash.edu.au/unit-evaluation-reports/

Over the past few years the Faculty of Information Technology has made a number of improvements to its courses as a result of unit evaluation feedback. Some of these include systematic analysis and planning of unit improvements, and consistent assignment return guidelines.

Monquest Teaching Evaluation surveys may be used by some of your academic staff this semester. They are administered by the Centre for Higher Education Quality (CHEQ) and may be completed in class with a facilitator or online through the my.monash portal. The data provided to lecturers is completely anonymous. Monquest surveys provide academic staff with evidence of the effectiveness of their teaching and identify areas for improvement. Individual Monquest reports are confidential; however, you can see the summary results of Monquest evaluations for 2006 at http://www.adm.monash.edu.au/cheq/evaluations/monquest/profiles/index.html

Unit staff - contact details

Unit leader

Dr Maria Indrawan
Senior Lecturer
Phone +61 3 990 31916
Fax +61 3 990 31077

Contact hours : Tuesday 2-4 PM

Lecturer(s) :

Professor Balasubramaniam Srinivasan
Professor, and Head of School
Phone +61 3 990 31333 or +61 3 990 55222
Fax +61 3 990 55157
Dr Campbell Wilson
Lecturer
Phone +61 3 990 31142
Fax +61 3 990 31077

Tutor(s) :

Mr Manoj Kathpalia
Mr Samar Zutshi

Teaching and learning method

Tutorial allocation

There are two tutorial classes scheduled for this unit. Students may choose to attend either of the scheduled classes. There is no need to register for a specific tutorial class.

Communication, participation and feedback

Monash aims to provide a learning environment in which students receive a range of ongoing feedback throughout their studies. You will receive feedback on your work and progress in this unit. This may take the form of group feedback, individual feedback, peer feedback, self-comparison, verbal and written feedback, discussions (online and in class) as well as more formal feedback related to assignment marks and grades. You are encouraged to draw on a variety of feedback to enhance your learning.

It is essential that you take action immediately if you realise that you have a problem that is affecting your study. Semesters are short, so we can help you best if you let us know as soon as problems arise. Regardless of whether the problem is related directly to your progress in the unit, if it is likely to interfere with your progress you should discuss it with your lecturer or a Community Service counsellor as soon as possible.

In this unit, students are encouraged to use the following model of communication:

  • Unit administration questions should be directed to the unit leader, maria.indrawan@infotech.monash.edu.au.
  • Unit content questions (e.g. lecture content, tutorial questions, assignment questions) should be posted to the discussion group in MUSO.

Unit Schedule

Week  Topic                                   Key dates
1     Introduction
2     Overview of data mining approaches
3     Data preparation in data mining
4     Mining associations in large databases
5     Classification + Clustering, Part 1
6     Classification + Clustering, Part 2
7     Classification + Clustering, Part 3     2007/08/27 Assignment 1 Due
8     Unit Test                               2007/09/03 Mid semester test
9     Neural Networks I
10    Neural Networks II
      Mid semester break
11    Visualisation in Data Mining            2007/10/01 Assignment 2 Due
12    Web mining
13    Revision

Unit Resources

Prescribed text(s) and readings

There is no prescribed textbook.

Recommended text(s) and readings

  • Data Mining: A Tutorial-Based Primer, Roiger R & Geatz M
  • Data Mining: Concepts and Techniques, Han J & Kamber M

Required software and/or hardware

WEKA, http://www.cs.waikato.ac.nz/ml/weka/

Study resources

Study resources we will provide for your study are:

  • Weekly lecture notes;
  • Weekly tutorial exercises;
  • Assignment specifications and sample solutions;
  • A sample examination;
  • Discussion groups;
  • This Unit Guide outlining the administrative information for the unit;
  • The unit web site on MUSO, where resources outlined above will be made available.

Library access

The Monash University Library site contains details about borrowing rights and catalogue searching. To learn more about the library and the various resources available, please go to http://www.lib.monash.edu.au.  Be sure to obtain a copy of the Library Guide, and if necessary, the instructions for remote access from the library website.

Monash University Studies Online (MUSO)

All unit and lecture materials are available through the MUSO (Monash University Studies Online) site. You can access this site by going to:

  a) http://muso.monash.edu.au or
  b) via the portal (http://my.monash.edu.au).

In the portal, click on the My Units tab, then the Monash University Studies Online hyperlink.

In order for your MUSO unit(s) to function correctly, certain programs, such as Java version 1.4.2, may need to be installed. You can update the relevant software by going to http://www.monash.edu.au/muso/support/students/downloadables-student.html.

You can contact the MUSO helpdesk by: Phone: (+61 3) 9903-1268 or 9903-2764

Operational hours (Monday - Thursday) - local time

Australia: 8 am to 10 pm (8 pm Non Teaching period)

Malaysia: 6 am to 8 pm (6 pm Non Teaching period)

South Africa: 11 pm to 1 pm (11 am Non Teaching period)

Operational hours (Friday) - local time

Australia: 8 am to 8 pm

Malaysia: 6 am to 6 pm

South Africa: 11 pm to 11 am

Operational hours (Saturday-Sunday) - local time (Teaching and Exam Period Only)

Australia: 1 pm to 5 pm

Malaysia: 11 am to 3 pm

South Africa: 4 am to 8 am

Further information can be obtained from the MUSO support site: http://www.monash.edu.au/muso/support/index.html

Assessment

Unit assessment policy

The unit is assessed with two assignments, a mid-semester test and a two-hour closed-book examination. To pass the unit you must:

  • achieve no less than 40% of the total marks in the exam;
  • achieve no less than 40% of the total marks available on the assignments and unit test;
  • achieve no less than 50% of the total marks available on the unit.
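
As an illustration only, the three hurdles above can be checked with a short Python sketch. It assumes that the assignments and the unit test are treated as one combined pool of marks, and that marks are already weighted using the weightings listed later in this guide (15%, 15%, 10% and 60%). The function and dictionary names are hypothetical and are not part of any Monash system.

```python
# Weightings (percent of the unit) as listed in the Assessment section.
WEIGHTS = {"assignment1": 15, "assignment2": 15, "test": 10, "exam": 60}

# Components counted together for the in-semester hurdle (an assumption
# about how "assignments and unit test" are pooled).
IN_SEMESTER = ("assignment1", "assignment2", "test")


def passes_unit(marks):
    """Return True if all three hurdles are met.

    `marks` maps each component in WEIGHTS to the weighted marks earned,
    for example 9 out of the 15 available for assignment 1.
    """
    in_semester_earned = sum(marks[c] for c in IN_SEMESTER)
    in_semester_available = sum(WEIGHTS[c] for c in IN_SEMESTER)  # 40
    total_earned = sum(marks[c] for c in WEIGHTS)
    total_available = sum(WEIGHTS.values())  # 100

    # Comparisons are cross-multiplied to avoid floating-point percentages.
    exam_ok = 100 * marks["exam"] >= 40 * WEIGHTS["exam"]
    in_semester_ok = 100 * in_semester_earned >= 40 * in_semester_available
    total_ok = 100 * total_earned >= 50 * total_available
    return exam_ok and in_semester_ok and total_ok


# 54 marks overall, with both 40% hurdles met.
print(passes_unit({"assignment1": 9, "assignment2": 9, "test": 6, "exam": 30}))    # True
# 52 marks overall, but only 20 of 60 in the exam (below the 40% hurdle).
print(passes_unit({"assignment1": 12, "assignment2": 12, "test": 8, "exam": 20}))  # False
```

The second case shows why the hurdles matter: a total above 50% is not sufficient if the exam mark falls below 40% of the exam marks available.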

Assignment tasks

  • Assignment Task
    Title :
    Association Rules Mining
    Description :
    Students are expected to perform association rules mining based on a given data set.
    Weighting :
    15%
    Criteria for assessment :
    • Understanding of the motivation behind association rules mining.
    • Understanding of the working of the Apriori algorithm.
    • Correctness of the submitted solution.
    Due date :
    Week 7, Monday 27th August 2007, 6 PM.
    Remarks :
    Assignments are due at the start of the lecture in week 7.
  • Assignment Task
    Title :
    Using WEKA for data mining
    Description :
    Students are required to use WEKA to perform data mining of a given data set.
    Weighting :
    15%
    Criteria for assessment :
    • Ability to use WEKA as a data mining tool.
    • Ability to choose appropriate data mining techniques for a given dataset.
    • Ability to discuss and write a report on the interpretation of the data mining results.
    Due date :
    Week 11, Monday 1st October 2007, 6 PM.
    Remarks :
    Assignment is due at the start of the lecture.
  • Assignment Task
    Title :
    Mid semester test
    Description :
    Mid-semester test. Multiple choice questions.
    Weighting :
    10%
    Criteria for assessment :
    Correct selection of answers in multiple choice questions.
    Due date :
    Week 8, Test will be conducted in the lecture.

Examinations

  • Examination
    Weighting :
    60%
    Length :
    2 hours
    Type ( open/closed book ) :
    closed book

Assignment submission

All assignments are to be submitted on the due date at the start of the lecture.

Assignment coversheets

An individual assignment coversheet must be attached to every assignment submission.

University and Faculty policy on assessment

Due dates and extensions

The due dates for the submission of assignments are given in the previous section. Please make every effort to submit work by the due dates. It is your responsibility to structure your study program around assignment deadlines, family, work and other commitments. Factors such as normal work pressures, vacations, etc. are seldom regarded as appropriate reasons for granting extensions. Students are advised to NOT assume that granting of an extension is a matter of course.

Requests for extensions must be made to the unit lecturer at your campus at least two days before the due date. You will be asked to forward original medical certificates in cases of illness, and may be asked to provide other forms of documentation where necessary. A copy of the email or other written communication of an extension must be attached to the assignment submission.

Return dates

Students can expect assignments to be returned within two weeks of the submission date or after receipt, whichever is later.

Assessment for the unit as a whole is in accordance with the provisions of the Monash University Education Policy at: http://www.adm.monash.edu.au/unisec/academicpolicies/policy/assessment.html

Plagiarism, cheating and collusion

Plagiarism and cheating are regarded as very serious offences. In cases where cheating has been confirmed, students have been severely penalised, from losing all marks for an assignment to facing disciplinary action at the Faculty level. While we would wish that all our students adhere to sound ethical conduct and honesty, we ask you to acquaint yourself with Student Rights and Responsibilities (http://www.infotech.monash.edu.au/about/committees-groups/facboard/policies/studrights.html) and the Faculty regulations that apply to students detected cheating, as these will be applied in all detected cases.

In this University, cheating means seeking to obtain an unfair advantage in any examination or any other written or practical work to be submitted or completed by a student for assessment. It includes the use, or attempted use, of any means to gain an unfair advantage for any assessable work in the unit, where the means is contrary to the instructions for such work. 

When you submit an individual assessment item, such as a program, a report, an essay, assignment or other piece of work, under your name you are understood to be stating that this is your own work. If a submission is identical with, or similar to, someone else's work, an assumption of cheating may arise. If you are planning on working with another student, it is acceptable to undertake research together, and discuss problems, but it is not acceptable to jointly develop or share solutions unless this is specified by your lecturer. 

Intentionally providing students with your solutions to assignments is classified as "assisting to cheat" and students who do this may be subject to disciplinary action. You should take reasonable care that your solution is not accidentally or deliberately obtained by other students. For example, do not leave copies of your work in progress on the hard drives of shared computers, and do not show your work to other students. If you believe this may have happened, please be sure to contact your lecturer as soon as possible.

Cheating also includes taking into an examination any material contrary to the regulations, including any bilingual dictionary, whether or not with the intention of using it to obtain an advantage.

Plagiarism involves the false representation of another person's ideas or findings as your own, by either copying material or paraphrasing without citing sources. It is both professional and ethical to reference clearly the ideas and information that you have used from another writer. If the source is not identified, then you have plagiarised the work of the other author. Plagiarism is a form of dishonesty that is insulting to the reader and grossly unfair to your student colleagues.

Register of counselling about plagiarism

The university requires faculties to keep a simple and confidential register to record counselling to students about plagiarism (e.g. warnings). The register is accessible to Associate Deans Teaching (or nominees) and, where requested, students concerned have access to their own details in the register. The register is to serve as a record of counselling about the nature of plagiarism, not as a record of allegations; and no provision of appeals in relation to the register is necessary or applicable.

Non-discriminatory language

The Faculty of Information Technology is committed to the use of non-discriminatory language in all forms of communication. Discriminatory language is that which refers in abusive terms to gender, race, age, sexual orientation, citizenship or nationality, ethnic or language background, physical or mental ability, or political or religious views, or which stereotypes groups in an adverse manner. This is not meant to preclude or inhibit legitimate academic debate on any issue; however, the language used in such debate should be non-discriminatory and sensitive to these matters. It is important to avoid the use of discriminatory language in your communications and written work. The most common form of discriminatory language in academic work tends to be in the area of gender inclusiveness. You are, therefore, requested to check for this and to ensure your work and communications are non-discriminatory in all respects.

Students with disabilities

Students with disabilities that may disadvantage them in assessment should seek advice from one of the following before completing assessment tasks and examinations:

Deferred assessment and special consideration

Deferred assessment (not to be confused with an extension for submission of an assignment) may be granted in cases of extenuating personal circumstances such as serious personal illness or bereavement. Special consideration in the awarding of grades is also possible in some circumstances. Information and forms for Special Consideration and deferred assessment applications are available at http://www.monash.edu.au/exams/special-consideration.html. Contact the Faculty's Student Services staff at your campus for further information and advice.

Last updated : 5 Jul 2007